Foreword


The ARM architecture is the basis of the world's most widely available 32-bit microprocessor. 


ARM Powered microprocessors are being routinely designed into a wider range of products than any other 
32-bit processor. This diversity of applicability is made possible by the ARM architecture, resulting in 
optimal system solutions at the crossroads of high performance, low power consumption, and low cost. 


In November 1990, ARM Ltd. was formed to develop and promote the ARM architecture, with initial
investment from:


* Apple Computer, the world's third largest manufacturer of personal computers
* Acorn, the United Kingdom's leading supplier of Information Technology for education
* VLSI Technology, the world's leading ASIC supplier


ARM Ltd. then began work to establish the ARM architecture as the standard 32-bit microprocessor
architecture.


Initially, ARM devices were made by VLSI Technology, but ARM's business model is to license 
microprocessor designs to a broad range of semiconductor manufacturers, allowing focus on a variety of 
end-user applications. To date, this licensing strategy has resulted in twelve companies manufacturing 
ARM-based designs: 


VLSI Technology          GEC Plessey Semiconductors (GPS)   Sharp Corporation
Texas Instruments (TI)   Cirrus Logic                       Asahi Kasei Microsystems (AKM)
Samsung Corporation      Digital Equipment Corporation      European Silicon Structures (ES2)
NEC Corporation          Symbios Logic                      Lucky Goldstar Corporation


Through these twelve semiconductor partners, the ARM processor is currently being used in a wider range 
of applications than any other 32-bit microprocessor. ARM chooses new partners carefully; each new 
partner is judged on their ability to extend the applicability of ARM processors into new market areas, 
broadening the total ARM product range by adding their unique expertise. 


Customers using ARM processors benefit not only from the ARM architecture, but also from the ability
to select the most appropriate silicon manufacturer. Furthermore, the worldwide awareness of the ARM
processor has attracted third-party developers to the ARM architecture, yielding a huge variety of
ARM support products:


ISI, Microtec, Accelerated Technology, Cygnus, Eonic Systems and Perihelion all offer 
Operating Systems for the ARM. 


Yokogawa Digital and Lauterbach provide In-Circuit Emulators (ICE). 
Hewlett Packard Logic analysers support ARM processors. 


In ARM Ltd's five-year history, it has delivered over 30 unique microprocessor and support-chip designs. 
ARM microprocessors are routinely being designed into hundreds of products including: 


cellular telephones      interactive game consoles
organisers               disk drives
modems                   high performance workstations
graphics accelerators    car navigation systems
video phones             digital broadcast set-top decoders
cameras                  smart cards
telephone switchboards   laser printers


This product range includes microprocessors designed as macrocell cores, occupying around 4 mm² of
silicon and drawing less than fifty milliwatts of power while delivering over 30 MIPS of sustained
performance. For very high performance applications, ARM implementations deliver over 200 MIPS
sustained performance (more than most high-performance computer workstations), while still
consuming only half a watt.


No other processor architecture can offer implementations in these extremes; no other processor has 
ARM's broad applicability across an entire product range. 


The ARM Architecture Reference Manual is the definitive description of the programmers’ model of all 
ARM microprocessors, and is ARM's commitment to users of the ARM processor for compatibility and 
interworking across products designed and manufactured by many different companies. 


Through this commitment, the benefits of an ARM processor can be harnessed to establish and maintain 
an industry lead that only ARM Powered products can achieve. 


Dave Jaggar. July 1996 
Preface


This preface describes the ARM Architecture Version 4, also known as ARMv4, 
and lists the conventions and terminology used in this manual. 




Introduction


The ARM Architecture exists in 4 major versions:


Version 1
was implemented only by ARM1, and was never used in a commercial product.


Version 2
extended Version 1 by adding a multiply instruction, a multiply accumulate instruction,
coprocessor support, two more banked registers for FIQ mode and later (Version 2a)
an atomic load and store instruction (called SWP) and the use of Coprocessor 15 as
a system control coprocessor. Versions 1, 2 and 2a all support a 26-bit address bus and
combine in register 15 a 24-bit Program Counter (PC) and 8 bits of processor status.

Version 2 has just three privileged processing modes: Supervisor, IRQ and FIQ.


Version 3
extended the addressing range to 32 bits, defining just a 30-bit Program Counter value
in register 15, and added a separate 11-bit status register (the Current Program Status
Register or CPSR) to contain the status information that was previously in register 15.

Version 3 added two new privileged processing modes to allow coprocessor emulation and
virtual memory support in supervisor mode:

* Undefined
* Abort

Five more status registers (the Saved Program Status Registers, SPSRs) were defined,
one for each privileged processor mode, to preserve the CPSR contents when the exception
corresponding to that privileged processor mode occurs.

Version 3 supports both hardware and software emulation of Version 2a. The processor can
be forced to be Version 2a only in hardware, allowing full backwards compatibility, but
no Version 3 advantages. Version 3 can also be switched in software into a Version 2
execution model to support Version 2a on a process by process basis, allowing a smooth
upgrade path from Version 2 to Version 3. See Chapter 5, The 26-bit Architectures for
information on the differences between the 26-bit architectures (Versions 1, 2 and 2a)
and the 32-bit architectures (Versions 3, 3M, 4 and 4T).


Version 3G
is the same as Version 3, without the backwards compatibility support for Version 2
architectures.


Version 3M
adds signed and unsigned multiply and multiply accumulate instructions that produce
a 64-bit result to Version 3. The new ARMv3M instructions are marked as such in the text.


Version 4
adds halfword load and store instructions, sign-extending byte and halfword load
instructions, adds a new privileged processor mode (that uses the user mode registers)
and defines several new undefined instructions.


Version 4T
incorporates an instruction decoder for a 16-bit subset of the ARM instruction set
(called THUMB).
Using this Manual


The information in this manual is organised as follows: 


Chapter 1   gives an overview of the ARM architecture
Chapter 2   describes the programmer’s model
Chapter 3   lists and describes the ARM instruction set
Chapter 4   gives examples of coding algorithms
Chapter 5   describes the differences between 32-bit and 26-bit architectures
Chapter 6   lists and describes the Thumb instruction set
Chapter 7   describes ARM system architecture, and the system control processor


Conventions


This manual employs typographic conventions intended to improve its ease of use.


code   a monospaced typewriter font shows code which you need to enter,
       or code which is provided as an example


Terminology


This manual uses the following terminology: 


Abort 
is caused by an illegal memory access. Aborts can be caused by the external 
memory system or the MMU 


AND 
performs a bitwise AND 


Arithmetic_Shift_Right 
performs a right shift, repeatedly inserting the original left-most bit (the sign bit) in 
the vacated bit positions on the left 


ARM instruction 
is a word that is word-aligned 


assert statements 
are used to ensure a certain condition has been met 


Assignment 
is signified by = 


Binary numbers 
are preceded by 0b


Boolean AND 
is signified by the “and” operator 


Boolean OR
is signified by the “or” operator 


BorrowFrom
returns 1 if the following subtraction causes a borrow (the true unsigned
result is less than 0)
returns 0 in all other cases


Byte
is an 8-bit data item


CarryFrom
returns 1 if the following addition causes a carry
(the true unsigned result is bigger than 2^32 - 1)
returns 0 in all other cases


case ... endcase statements
are used to indicate a one-of-many execution option. Indentation indicates the range
of statements in each option


Comments 
are enclosed in /* */ 


ConditionPassed(cond)
returns true if the state of the N, Z, C and V flags fulfils the condition encoded
in the cond argument
returns false in all other cases


CPSR
is the Current Program Status Register


CurrentModeHasSPSR()
returns true if the current processor mode is not User mode or System mode
returns false if the current mode is User mode or System mode


Do-not-modify fields (DNM) 
means the value must not be altered by software. DNM fields read as
UNPREDICTABLE values, and may only be written with the same value read from the
same field on the same processor.


Elements 
are separated by | in a list of possible values for a variable 


EOR 
performs an Exclusive OR 


Exception 
is a mechanism to handle an event; for example, an external interrupt or an 
undefined instruction 


External abort 
is an abort that is generated by the external memory system 


Fault
is an abort that is generated by the MMU 


General-purpose register 
is one of the 32-bit general-purpose integer registers, R0 to R14


Halfword 
is a 16-bit data item 


Hexadecimal numbers 
are preceded by 0x


if ... else if ... else statements 
are used to signify conditional execution. Indentation indicates the range of 
statements in each condition 


IGNORE fields (IGN) 
must ignore writes 


Immediate and offset fields 
are unsigned unless otherwise stated 


IMPLEMENTATION-DEPENDENT fields (IMP) 
are not architecturally specified, but must be defined and documented by individual 
implementations 


InAPrivilegedMode()
returns true if the current processor mode is not User mode
returns false if the current mode is User mode


Logical_Shift_Left 
performs a left shift, inserting zeroes in the vacated bit positions on the right. 
<< is used as a short form for Logical_Shift_Left 


Logical_Shift_Right 
performs a right shift, inserting zeroes in the vacated bit positions on the left 


LR (Link Register) 
is integer register R14 


Memory[<address>,<size>] 
refers to a data item in memory of length <size>, at address <address>, aligned on 
a <size> byte boundary. 
The data item is zero-extended to 32 bits. 
Currently defined sizes are: 
1 for bytes 
2 for halfwords 
4 for words 


To align on a <size> boundary, halfword accesses ignore address[0] and word
accesses ignore address[1:0]


NOT
performs a Complement 


NotFinished(CP_number)
returns true if the coprocessor signified by the CP_number argument has
signalled that the current operation is not complete
returns false in all other cases


NumberOfSetBitsIn (bitfield) 
performs a population count on (counts the set bits in) the bitfield argument 


Object[from:to] 
indicates the bit field extracted from Object starting at bit “from”, ending with bit “to” 
(inclusive) 


Optional parts of instructions 
are surrounded by { and } 


OR 
performs an Inclusive OR 


OverflowFrom 
returns 1 if the following addition (or subtraction) causes a carry (or borrow) to
(from) bit[31]. Addition causes an overflow if both operands have the same sign
(bit[31]), and the sign of the result is different from the sign of both operands.
Subtraction causes an overflow if the operands have different signs, and the first
operand and the result have different signs


PC (Program Counter) 
is integer register R15 (or bits[25:2] of register 15 on 26-bit architectures) 


PSR 
is the CPSR or one of the SPSRs (or bits[31:26] and bits[1:0] of register 15 on 26-bit 
architectures) 


Read-as-zero fields (RAZ) 
appear as zero when read 


Read-Modify-Write fields (RMW)
should be read to a general-purpose register, the relevant fields updated in 
the register, and the register value written back. 


Rotate_Right 
performs a right rotate, where each bit that is shifted off the right is inserted on 
the left 


Security hole 
is an illegal mechanism that bypasses system protection 


Should-be-one fields (SBO)
should be written as one (or all 1s for bit fields) by software.
Values other than one produce UNPREDICTABLE results


Should-be-one-or-preserved fields (SBOP)
should be written as one (or all 1s for bit fields) or preserved by writing the same
value that has been previously read from the same fields on the same processor.


Should-be-zero fields (SBZ)
should be written as zero (or all 0s for bit fields) by software.
Non-zero values produce UNPREDICTABLE results


Should-be-zero-or-preserved fields (SBZP)
should be written as zero (or all 0s for bit fields) or preserved by writing the same
value that has been previously read from the same fields on the same processor.


Signed immediate and offset fields 
are encoded in two’s complement notation unless otherwise stated 


SignExtend(arg) 
sign-extends (propagates the sign bit) its argument to 32 bits 


SPSR 
is the Saved Program Status Register 


Test for equality 
is signified by == 


THUMB instruction 
is a halfword that is halfword-aligned 


Unaffected items 
are not changed by a particular operation 


UNDEFINED 
indicates an instruction that generates an undefined instruction trap. See 2.5 
Exceptions on page 2-6 for information on undefined instruction traps 


UNPREDICTABLE
means the result of an instruction cannot be relied upon.
UNPREDICTABLE instructions or results must not represent security holes.
UNPREDICTABLE instructions must not halt or hang the processor, or any parts of
the system


UNPREDICTABLE fields (UNP) 
do not contain valid data, and a value may vary from moment to moment, 
instruction to instruction, and implementation to implementation 


Variable name 
(a symbolic name for values) is surrounded by < and > 


while .... statements 
are used to indicate a loop. Indentation indicates the range of statements in 
the loop 


Word 
is a 32-bit data item 
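

Several of the terms above are small functions on 32-bit values. The following Python sketch is one possible reading of CarryFrom, BorrowFrom, OverflowFrom, SignExtend, Arithmetic_Shift_Right, Rotate_Right and Object[from:to]. The function names, the split of OverflowFrom into separate add and subtract helpers, and the use of Python integers masked to 32 bits are assumptions made for illustration only.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as unsigned 32-bit integers


def carry_from(a, b):
    """CarryFrom: 1 if the true unsigned sum is bigger than 2^32 - 1."""
    return 1 if (a & MASK32) + (b & MASK32) > MASK32 else 0


def borrow_from(a, b):
    """BorrowFrom: 1 if the true unsigned result of a - b is less than 0."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


def overflow_from_add(a, b):
    """Same-signed operands whose sum has a different sign."""
    a, b = a & MASK32, b & MASK32
    result = (a + b) & MASK32
    return ((~(a ^ b) & (a ^ result)) >> 31) & 1


def overflow_from_sub(a, b):
    """Differently signed operands, and the result sign differs from a."""
    a, b = a & MASK32, b & MASK32
    result = (a - b) & MASK32
    return (((a ^ b) & (a ^ result)) >> 31) & 1


def sign_extend(value, bits):
    """SignExtend: propagate the top bit of a 'bits'-wide value to 32 bits."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign) - sign) & MASK32


def arithmetic_shift_right(value, amount):
    """Right shift that copies the original sign bit into vacated positions."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK32


def rotate_right(value, amount):
    """Bits shifted off the right are inserted on the left."""
    value &= MASK32
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def bits(obj, frm, to):
    """Object[from:to]: inclusive bit field, 'from' being the higher bit."""
    return (obj >> to) & ((1 << (frm - to + 1)) - 1)
```

For example, adding 1 to 0x7FFFFFFF overflows as a signed addition but produces no carry, while adding 1 to 0xFFFFFFFF carries without overflowing.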
Architecture Overview


The ARM architecture has been designed to allow very small, yet high-performance
implementations. It is the architectural simplicity of ARM which leads to very small
implementations, and small implementations allow devices with very low power
consumption.


1.1 Overview
1.2 Exceptions
1.3 ARM Instruction Set
1.4 Branch Instructions
1.5 Data-processing Instructions
1.6 Load and Store Instructions
1.7 Coprocessor Instructions
Overview


The ARM is a RISC (Reduced Instruction Set Computer), as it incorporates all
the features of a typical RISC architecture:


* a large uniform register file
* a load-store architecture (data-processing operations only operate on register
  contents)
* simple addressing modes (data loaded and stored from an address specified
  in registers and instruction fields)
* uniform and fixed length instruction fields (which simplify instruction decode)


In addition, the ARM architecture provides these features:


* control over both the ALU and shifter in every data-processing instruction
  to maximise the use of a shifter and an ALU
* auto-increment and auto-decrement addressing modes to optimise program loops
* load and store multiple instructions to maximise data throughput
* conditional execution of all instructions to maximise execution throughput


Together, these architectural enhancements to a basic RISC architecture allow
implementations that can balance high performance, low power consumption and
minimal die size.


ARM registers


ARM has thirty-one 32-bit registers. At any one time, sixteen are visible; the other
registers are used to speed up exception processing. All the register specifiers in ARM
instructions can address any of the 16 registers.


The main bank of sixteen registers is used by all non-privileged code; these are the User
mode registers. User mode is different from all other modes, as it is non-privileged,
which means that User mode is the only mode which cannot switch to another processor
mode (without generating an exception).


Program counter 


Register 15 is the Program Counter (or PC), and can be used in most instructions as
a pointer to the instruction which is two instructions after the instruction being executed.
All ARM instructions are 4 bytes long (one 32-bit word), and are always aligned on
a word boundary, so the PC contains just 30 bits; the bottom two bits are always zero.
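

The "two instructions after" rule can be written as simple arithmetic. The sketch below is a minimal Python illustration; the function names are hypothetical, and it covers only the ordinary case in which reading R15 yields a defined value.

```python
INSTRUCTION_SIZE = 4  # every ARM instruction is one 32-bit word


def pc_as_read(instruction_address):
    """Value an ARM instruction at instruction_address sees when it reads R15.

    The PC points two instructions ahead, i.e. 8 bytes past the instruction.
    """
    return (instruction_address + 2 * INSTRUCTION_SIZE) & 0xFFFFFFFF


def is_word_aligned(address):
    """True when the bottom two bits are zero, as they are for any ARM PC."""
    return (address & 0b11) == 0
```

An instruction stored at address 0x8000 therefore reads 0x8008 from the PC.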


Link register 


Register 14 is called the Link Register (or LR). Its special purpose is to hold the address 
of the next instruction after a Branch with Link (BL) instruction, which is the instruction 
used to make a subroutine call. At all other times, R14 can be used as a 
general-purpose register. 


Other registers 


The remaining 14 registers have no special hardware purpose - their uses are defined 
purely by software. Software will normally use R13 as a stack pointer (or SP). 
Exceptions


ARM supports 5 types of exception, and a privileged processing mode for each type.
The 5 types of exception are:


* two levels of interrupt (fast and normal)
* memory aborts (used to implement memory protection or virtual memory)
* attempted execution of an undefined instruction
* software interrupts (SWIs) (used to make a call to an Operating System)


When an exception occurs, some of the standard registers are replaced with registers 
specific to the exception mode. All exceptions have replacement (or banked) registers 
for R14 and R13, and one interrupt mode has more registers for fast interrupt 
processing. 


After an exception, R14 holds the return address for exception processing, which is 
used both to return after the exception is processed and to address the instruction that 
caused the exception. 


R13 is banked across exception modes to provide each exception handler with a private
stack pointer (SP). The fast interrupt mode also banks registers 8 to 12, so that interrupt
processing can begin without the need to save or restore these registers. There is
a seventh processing mode, System mode, that does not have any banked registers
(it uses the User mode registers), which is used to run normal (non-exception) tasks that
require a privileged processor mode.


CPSR and SPSR 


All other processor state is held in status registers. The current operating processor 
status is in the Current Program Status Register or CPSR. The CPSR holds: 


* 4 condition code flags (Negative, Zero, Carry and Overflow)
* 2 interrupt disable bits (one for each type of interrupt)
* 5 bits which encode the current processor mode


All 5 exception modes also have a Saved Program Status Register (SPSR) which holds 
the CPSR of the task immediately before the exception occurred. Both the CPSR and 
SPSR are accessed with special instructions. 


The exception process 


When an exception occurs, ARM halts execution after the current instruction and begins 
execution at a fixed address in low memory, known as the exception vectors. There is 
a separate vector location for each exception (and two for memory aborts to distinguish 
between data and instruction accesses). 
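

The entry sequence described in this section can be sketched as a small state update. The Python below is an illustrative model only: the dictionary layout of `cpu` and the function name are invented for this example, the vector addresses are the standard ARM low-memory vector offsets (not listed in this section), and the return address is passed in by the caller because the exact value stored in R14 depends on the exception type. The reset vector at address 0 is left out.

```python
# Standard ARM vector offsets and the mode each exception enters (assumed here).
VECTORS = {
    "undefined_instruction": (0x04, "Undefined"),
    "software_interrupt": (0x08, "Supervisor"),
    "prefetch_abort": (0x0C, "Abort"),   # abort on an instruction access
    "data_abort": (0x10, "Abort"),       # abort on a data access
    "irq": (0x18, "IRQ"),
    "fiq": (0x1C, "FIQ"),
}


def enter_exception(cpu, kind, return_address):
    """Apply the architectural side effects of taking an exception."""
    vector, mode = VECTORS[kind]
    cpu["spsr"][mode] = dict(cpu["cpsr"])   # save the interrupted task's CPSR
    cpu["r14"][mode] = return_address       # banked link register of the new mode
    cpu["cpsr"]["mode"] = mode
    cpu["cpsr"]["irq_disabled"] = True      # normal interrupts are masked on entry
    if kind == "fiq":
        cpu["cpsr"]["fiq_disabled"] = True  # fast interrupts masked only by FIQ
    cpu["pc"] = vector                      # execution resumes at the vector
    return cpu
```

Note that the two abort entries share one mode but use separate vectors, which is how data and instruction aborts are told apart.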


An operating system will install a handler for every exception at initialisation. Privileged
operating system tasks are normally run in System mode to allow exceptions to occur
within the operating system without state loss (each exception overwrites the R14 of its
own mode when it occurs, and System mode is the only privileged mode that cannot be
entered by an exception).
ARM Instruction Set


The ARM instruction set can be divided into four broad classes of instruction:


* branch
* data-processing
* load and store
* coprocessor


Conditional execution 


All ARM instructions may be conditionally executed. Data-processing instructions
(and one type of coprocessor instruction) can update the four condition code flags in
the CPSR (Negative, Zero, Carry and Overflow) according to their result. Subsequent
instructions can be conditionally executed according to the status of the condition code
flags. Fifteen conditions are implemented, depending on particular values of
the condition code flags; one condition actually ignores the condition code flags so that
normal (unconditional) instructions always execute.
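

As an illustration, the fifteen conditions can be written as a lookup from a 4-bit condition value to a test on the N, Z, C and V flags. The sketch below uses the standard ARM condition encodings and mnemonics, which are not listed in this section and are therefore assumptions here; the function name mirrors the ConditionPassed term from the Terminology list.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if the flags satisfy the 4-bit condition 'cond'."""
    checks = {
        0b0000: z,                      # EQ  equal
        0b0001: not z,                  # NE  not equal
        0b0010: c,                      # CS  carry set / unsigned higher or same
        0b0011: not c,                  # CC  carry clear / unsigned lower
        0b0100: n,                      # MI  negative
        0b0101: not n,                  # PL  positive or zero
        0b0110: v,                      # VS  overflow
        0b0111: not v,                  # VC  no overflow
        0b1000: c and not z,            # HI  unsigned higher
        0b1001: (not c) or z,           # LS  unsigned lower or same
        0b1010: n == v,                 # GE  signed greater than or equal
        0b1011: n != v,                 # LT  signed less than
        0b1100: (not z) and (n == v),   # GT  signed greater than
        0b1101: z or (n != v),          # LE  signed less than or equal
        0b1110: True,                   # AL  always: ignores the flags
    }
    if cond not in checks:
        raise ValueError("condition value is not one of the fifteen modelled")
    return bool(checks[cond])
```

The AL entry is the condition that ignores the flags, which is what lets ordinary instructions execute unconditionally.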


Branch Instructions 


As well as allowing any data-processing or load instruction to change control flow
(by writing the Program Counter), a standard branch instruction is provided with a 24-bit
signed offset, allowing forward and backward branches of up to 32Mbytes.


There is a Branch with Link option that also preserves the address of the instruction after
the branch in R14 (the Link Register or LR), allowing a move instruction to put the LR
into the PC and return to the instruction after the branch, providing a subroutine call.
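

The 32Mbyte range follows from the offset arithmetic: the 24-bit field is sign-extended and counted in words, so it is shifted left by two bits before being added to the PC. The following Python sketch shows that calculation under the usual assumption that the PC reads as the branch address plus 8; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def branch_target(instruction_address, offset24):
    """Target of a branch whose 24-bit signed word offset is offset24."""
    offset = offset24 & 0xFFFFFF
    if offset & 0x800000:          # sign bit of the 24-bit field
        offset -= 1 << 24
    pc = instruction_address + 8   # PC is two instructions ahead
    return (pc + (offset << 2)) & MASK32


def branch_with_link(instruction_address, offset24):
    """Return (target, link): the link value is the instruction after the branch."""
    link = (instruction_address + 4) & MASK32
    return branch_target(instruction_address, offset24), link
```

The largest positive offset, 0x7FFFFF words, reaches just under 32Mbytes forward, and the most negative, 0x800000, reaches exactly 32Mbytes back from the PC value. Copying the link value back into the PC returns to the instruction after the branch.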


There is also a special type of branch instruction called software interrupt (SWI).
This makes a call to the Operating System (to request an OS-defined service). SWI also
changes the processor mode, allowing an unprivileged task to gain OS privilege (access
to which is controlled by the OS).


On processors that implement the THUMB instruction set there is a branch instruction
that jumps to an address specified in a register, and optionally switches instruction set,
allowing ARM code to call THUMB code and THUMB code to call ARM code.
An overview of the THUMB instruction set is provided in Chapter 6, The Thumb Instruction
Set.


Data-processing Instructions 


The data-processing instructions perform some operation on the general-purpose 
registers. There are three types of data-processing instructions: 


* data-processing instructions proper
* multiply instructions
* status register transfer instructions


Arithmetic/logic instructions 


There are 16 arithmetic/logic instructions which share a common instruction format.
This format takes up to two source operands and performs an arithmetic/logic operation on
those operands. Most of these instructions then store the result into a register, and
optionally update the condition code flags according to that result.


There are four data-processing instructions which don't store their result in a register.
They compare the values of the source operands and then update the condition code
flags.


Of the two source operands:


* one is always a register
* the other has two basic forms:
  - an immediate value
  - a register value, optionally shifted


If the operand is a shifted register, the shift amount may be an immediate value or the value
of another register, and four types of shift can be specified. So, every data-processing
instruction can perform a data-processing and a shift operation. As a result, ARM does
not have dedicated shift instructions.
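

To make the combined shift-and-operate behaviour concrete, the sketch below models the four shift types named in the Terminology list and an addition whose second operand is a shifted register. It is a simplified illustration: the short labels LSL, LSR, ASR and ROR and the function names are assumptions, and only shift amounts from 1 to 31 are handled (the special cases of particular encodings are not modelled).

```python
MASK32 = 0xFFFFFFFF


def shift(value, kind, amount):
    """Apply one of the four shift types to a 32-bit value (0 < amount < 32)."""
    if not 0 < amount < 32:
        raise ValueError("this sketch only models shift amounts 1..31")
    value &= MASK32
    if kind == "LSL":                       # Logical_Shift_Left
        return (value << amount) & MASK32
    if kind == "LSR":                       # Logical_Shift_Right
        return value >> amount
    if kind == "ASR":                       # Arithmetic_Shift_Right
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK32
    if kind == "ROR":                       # Rotate_Right
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unknown shift type")


def add_shifted(rn, rm, kind, amount):
    """An ADD whose second operand is rm shifted by an immediate amount."""
    return (rn + shift(rm, kind, amount)) & MASK32
```

With the same register supplied as both operands and a left shift by 2, a single addition computes the register value times five, for example `add_shifted(7, 7, "LSL", 2)` gives 35.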


Because the Program Counter (PC) is a general-purpose register, arithmetic/logic
instructions may write their results directly to the PC, allowing a variety
of jump instructions to be easily implemented.


Multiply instructions


Multiply instructions come in two classes. Both classes multiply two 32-bit register
values and store their result:


(normal) 32-bit result   stores the 32-bit result in a register
(long) 64-bit result     stores the 64-bit result in two separate registers


Both types of multiply instruction can optionally perform an accumulate operation.
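

The two multiply classes differ only in how much of the product is kept. The Python sketch below is a hedged model of that difference; the function names and the `signed` and `accumulate` parameters are invented for illustration, and the long form returns the high and low words as a pair to stand for the two destination registers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def multiply(rm, rs, accumulate=0):
    """Normal multiply: only the low 32 bits of the result are stored."""
    return (rm * rs + accumulate) & MASK32


def multiply_long(rm, rs, signed=False, accumulate=0):
    """Long multiply: the 64-bit result is returned as (high word, low word)."""
    if signed:
        product = to_signed32(rm) * to_signed32(rs)
    else:
        product = (rm & MASK32) * (rs & MASK32)
    result = (product + accumulate) & MASK64
    return result >> 32, result & MASK32
```

Multiplying 0xFFFFFFFF by itself shows why both forms exist: unsigned, the 64-bit result is 0xFFFFFFFE00000001, while signed it is (-1) × (-1) = 1.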


Status register transfer instructions 


The status register transfer instructions transfer the contents of the CPSR or a SPSR to 
or from a general-purpose register. Writing to the CPSR is one way to set the value of 
the condition code flags and interrupt enable flags and to set the processor mode. 


Load and Store Instructions


Load and store instructions come in three types:


1 load or store the value of a single register
2 load and store multiple register values
3 swap a register value with the value of a memory location


Load and store single register 


Load register instructions can load a 32-bit word, a 16-bit halfword or an 8-bit byte from 
memory into a register. Byte and halfword loads may be automatically zero- or 
sign-extended as they are loaded. 


Store register instructions can store a 32-bit word, a 16-bit halfword or an 8-bit byte from 
a register to memory. 
Load and store instructions have three primary addressing modes that are formed by
adding or subtracting an immediate or register-based offset to or from a base register
(register-based offsets may also be scaled with shift operations):


1 offset
2 pre-indexed
3 post-indexed


Pre- and post-indexed addressing modes update the base register with the base plus 
offset calculation. As the Program Counter (PC) is a general-purpose register, a 32-bit 
value can be loaded directly into the PC to perform a jump to any address in the 4Gbyte 
memory space. 
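

The difference between the three addressing modes lies in which address is used for the access and whether the base register changes. The following Python sketch captures just that; the function name and the string labels are hypothetical, and a subtracted offset is represented as a negative number.

```python
MASK32 = 0xFFFFFFFF


def access_address(base, offset, mode):
    """Return (address used for the access, base register value afterwards)."""
    target = (base + offset) & MASK32
    if mode == "offset":          # base plus offset is used, base is unchanged
        return target, base
    if mode == "pre-indexed":     # base plus offset is used and written back
        return target, target
    if mode == "post-indexed":    # base alone is used, then base is updated
        return base, target
    raise ValueError("unknown addressing mode")
```

Starting from a base of 0x1000 with an offset of 4, the three modes access 0x1004, 0x1004 and 0x1000 respectively, and leave the base register holding 0x1000, 0x1004 and 0x1004.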


Load and store multiple registers


Load and Store multiple instructions perform a block transfer of any number of
the general-purpose registers to or from memory. Four addressing modes are provided:


1 pre-increment
2 post-increment
3 pre-decrement
4 post-decrement


The base address is specified by a register value (which may be optionally updated after
the transfer). As the subroutine return address and Program Counter (PC) values are in
general-purpose registers, very efficient subroutine call and return can be constructed
with Load and Store Multiple: register contents and the return address can be stacked
and the stack pointer updated with a single store multiple instruction at procedure entry;
then register contents can be restored, the PC loaded with the return address and the stack
pointer updated on procedure return with a single load multiple.


Of course, load and store multiple also allow very efficient data movement (for example, 
block copy). 
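

The four block-transfer addressing modes differ in where the block of words sits relative to the base register. The sketch below computes the word addresses and the updated base for each mode. It assumes, as is standard for these instructions, that registers are transferred to ascending addresses with the lowest-numbered register at the lowest address; the function name and mode labels are illustrative.

```python
MASK32 = 0xFFFFFFFF


def block_transfer_addresses(base, register_count, mode):
    """Return (ascending list of word addresses, base value after writeback)."""
    size = 4 * register_count
    starts = {
        "post-increment": base,             # first word at the base address
        "pre-increment": base + 4,          # first word just above the base
        "post-decrement": base - size + 4,  # last word at the base address
        "pre-decrement": base - size,       # last word just below the base
    }
    start = starts[mode]
    addresses = [(start + 4 * i) & MASK32 for i in range(register_count)]
    new_base = base + size if mode.endswith("increment") else base - size
    return addresses, new_base & MASK32
```

A stack that grows downwards can then be pushed with a pre-decrement store multiple and popped with a post-increment load multiple: pushing three registers from a stack pointer of 0x1000 writes 0xFF4 to 0xFFC and leaves the pointer at 0xFF4, and popping three from there reads the same words and restores 0x1000.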


Swap a register value with the value of a memory location


Swap can load a value from a register-specified memory location, store the contents of 
a register to the same memory location, then write the loaded value to a register. 


By specifying the same register as the load and store value, the contents of a memory 
location and a register are interchanged. 


The swap operation performs a special indivisible bus operation that allows atomic 
update of semaphores. Both 32-bit word and 8-bit byte semaphores are supported. 
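

A minimal sketch of how swap supports a semaphore is given below. It models memory as a Python dictionary and runs sequentially, so the indivisibility that the hardware bus operation provides is assumed rather than demonstrated. The convention that 0 means free and 1 means held, and both function names, are assumptions for this example.

```python
def swap(memory, address, store_value):
    """Load the old value at address, store the new one, return the old value.

    On real hardware the load and store form one indivisible bus operation.
    """
    loaded = memory[address]
    memory[address] = store_value
    return loaded


def try_acquire(memory, lock_address):
    """Attempt to take a semaphore: succeed only if it was free (0) before."""
    return swap(memory, lock_address, 1) == 0


def release(memory, lock_address):
    """Mark the semaphore as free again."""
    memory[lock_address] = 0
```

Because the old value is returned by the same operation that writes the new one, two tasks swapping in 1 at the same time cannot both observe 0.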


Coprocessor Instructions 


There are three types of coprocessor instructions:


data-processing instructions   start a coprocessor-specific internal operation
register transfers             allow a coprocessor value to be transferred to or
                               from an ARM register
data-transfer instructions     transfer coprocessor data to or from memory,
                               where the address of the transfer is calculated by
                               the ARM
This chapter introduces the ARM Programmer's Model.


Data Types
Processor Modes
Registers
Program Status Registers
Exceptions
Data Types


ARM Architecture Version 4 processors support the following data types:


Byte       8 bits
Halfword   16 bits; halfwords must be aligned to two-byte boundaries
Word       32 bits; words must be aligned to four-byte boundaries


ARM instructions are exactly one word (and therefore aligned on a four-byte boundary).


THUMB instructions are exactly one halfword (and therefore aligned on a two-byte 
boundary). 


All data operations (e.g. ADD) are performed on word quantities. 


Load and store operations can transfer bytes, halfwords and words to and from memory, 
automatically zero-extending or sign-extending bytes or halfwords as they are loaded. 


Signed operands are in two’s complement format. 
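

The effect of zero- and sign-extension on loaded bytes and halfwords can be sketched as follows. The model follows the Memory[<address>,<size>] convention from the Terminology list, in which the low address bits are ignored to align the access. Little-endian byte order is an assumption of this example (this section does not specify byte order), as are the function name and the use of a bytearray for memory.

```python
def load(memory, address, size, signed=False):
    """Load a 1-, 2- or 4-byte item and extend it to a 32-bit register value."""
    address &= ~(size - 1)  # halfwords ignore address[0], words ignore address[1:0]
    raw = bytes(memory[address + i] for i in range(size))
    value = int.from_bytes(raw, "little")       # assumed byte order
    if signed and value & (1 << (8 * size - 1)):
        value -= 1 << (8 * size)                # propagate the sign bit
    return value & 0xFFFFFFFF
```

A byte holding 0x80 loads as 0x00000080 when zero-extended and as 0xFFFFFF80 when sign-extended, which is -128 in two's complement.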


Processor Modes 


ARM Version 4 supports seven processor modes: 


    Processor mode   Description
1   User             normal program execution mode
2   FIQ              supports a high-speed data transfer or channel process
3   IRQ              used for general purpose interrupt handling
4   Supervisor       a protected mode for the operating system
5   Abort            implements virtual memory and/or memory protection
6   Undefined        supports software emulation of hardware coprocessors
7   System           runs privileged operating system tasks
                     (Architecture Version 4 only)


Table 2-1: ARM Version 4 processor modes


Mode changes may be made under software control or may be caused by external 
interrupts or exception processing. Most application programs will execute in User 
mode. The other modes, known as privileged modes, will be entered to service 
interrupts or exceptions or to access protected resources. 
2.3 Registers


The processor has a total of 37 registers:


* 30 general-purpose registers
* 6 status registers
* a program counter


The registers are arranged in partially overlapping banks: a different register bank for
each processor mode. At any one time, 15 general-purpose registers (R0 to R14),
one or two status registers and the program counter are visible. The general-purpose
registers and status registers currently visible depend on the current processor mode.
The register bank organisation is shown in Figure 2-1: Register organisation on
page 2-4. The banked registers are shaded in the diagram.


The general-purpose registers are 32 bits wide. 


Register 13 (the Stack Pointer or SP) is banked across all modes to provide a private
Stack Pointer for each mode (except System mode, which shares the User mode R13).


Register 14 (the Link Register or LR) is used as the subroutine return address link
register. R14 is also banked across all modes (except System mode, which shares the
User mode R14). When a subroutine call (Branch and Link instruction) is executed, R14
is set to the subroutine return address; R14_svc, R14_irq, R14_fiq, R14_abort and
R14_undef are used similarly to hold the return address when exceptions occur
(or a subroutine return address if subroutine calls are executed within interrupt or
exception routines). R14 may be treated as a general-purpose register at all other times.


FIQ mode also has banked registers R8 to R12 (as well as R13 and R14). R8_fiq,
R9_fiq, R10_fiq, R11_fiq and R12_fiq are provided to allow very fast interrupt
processing (without the need to preserve register contents by storing them to memory),
and to preserve values across interrupt calls (so that register contents do not need to
be restored from memory).


Register R15 holds the Program Counter (PC). When R15 is read, bits[1:0] are zero
and bits[31:2] contain the PC. When R15 is written, bits[1:0] are ignored and bits[31:2]
are written to the PC. Depending on how it is used, the value of the PC is either the
address of the instruction plus 8 or is UNPREDICTABLE.








The Current Program Status Register (CPSR) is also accessible in all processor modes. 
It contains condition code flags, interrupt enable flags and the current mode. Each 
privileged mode (except system mode) has a Saved Program Status Register (SPSR) 
which is used to preserve the value of the CPSR when an exception occurs. 


2.4 Program Status Registers 


The format of the Current Program Status Register (CPSR) and the Saved Program 
Status Registers (SPSR) is shown in Figure 2-2: Format of the program status 
registers on page 2-4. The N, Z, C and V (Negative, Zero, Carry and oVerflow) bits are 
collectively known as the condition code flags. The condition code flags in the CPSR 
can be changed as a result of arithmetic and logical operations in the processor and can 
be tested by all instructions to determine if the instruction is to be executed. 
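
User/System   Supervisor   Abort        Undefined    Interrupt    Fast interrupt

R0            R0           R0           R0           R0           R0
R1            R1           R1           R1           R1           R1
R2            R2           R2           R2           R2           R2
R3            R3           R3           R3           R3           R3
R4            R4           R4           R4           R4           R4
R5            R5           R5           R5           R5           R5
R6            R6           R6           R6           R6           R6
R7            R7           R7           R7           R7           R7
R8            R8           R8           R8           R8           R8_FIQ
R9            R9           R9           R9           R9           R9_FIQ
R10           R10          R10          R10          R10          R10_FIQ
R11           R11          R11          R11          R11          R11_FIQ
R12           R12          R12          R12          R12          R12_FIQ
R13           R13_SVC      R13_ABORT    R13_UNDEF    R13_IRQ      R13_FIQ
R14           R14_SVC      R14_ABORT    R14_UNDEF    R14_IRQ      R14_FIQ
PC            PC           PC           PC           PC           PC

CPSR          CPSR         CPSR         CPSR         CPSR         CPSR
              SPSR_SVC     SPSR_ABORT   SPSR_UNDEF   SPSR_IRQ     SPSR_FIQ


Figure 2-1: Register organisation 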
Figure 2-2: Format of the program status registers 


The bottom 8 bits of a PSR (incorporating I, F, T and M[4:0]) are known collectively as 
the control bits. The control bits change when an exception arises and can be altered 
by software only when the processor is in a privileged mode. The I and F bits are the 
interrupt disable bits: 


I bit    disables IRQ interrupts when it is set 
F bit    disables FIQ interrupts when it is set 


The T flag is only implemented on Architecture Version 4T (THUMB): 

0    indicates ARM execution 
1    indicates THUMB execution 


On all other versions of the architecture the T flag should be zero (SBZ). 


The M0, M1, M2, M3 and M4 bits (M[4:0]) are the mode bits, and these determine the 
mode in which the processor operates. The interpretation of the mode bits is shown in 
Table 2-2: The mode bits. Not all combinations of the mode bits define a valid processor 
mode. Only those explicitly described can be used; if any other value is programmed 
into the mode bits M[4:0], the result is unpredictable. 


M[4:0]    Mode     Accessible Registers 

0b10000   User     PC, R14 to R0, CPSR 
0b10001   FIQ      PC, R14_fiq to R8_fiq, R7 to R0, CPSR, SPSR_fiq 
0b10010   IRQ      PC, R14_irq, R13_irq, R12 to R0, CPSR, SPSR_irq 
0b10011   SVC      PC, R14_svc, R13_svc, R12 to R0, CPSR, SPSR_svc 
0b10111   Abort    PC, R14_abt, R13_abt, R12 to R0, CPSR, SPSR_abt 
0b11011   Undef    PC, R14_und, R13_und, R12 to R0, CPSR, SPSR_und 
0b11111   System   PC, R14 to R0, CPSR (Architecture Version 4 only) 

Table 2-2: The mode bits 


User mode and system mode do not have an SPSR, as these modes are not entered 
on any exception, so a register to preserve the CPSR is not required. In User mode or 
System mode any reads to the SPSR will read an unpredictable value, and any writes 
to the SPSR will be ignored. 
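

The mode-bit decoding in Table 2-2 can be illustrated with a small Python sketch. The encodings are taken from the table; the names MODES, decode_mode and has_spsr are hypothetical helpers introduced only for this illustration, and returning None for an undefined encoding is a modelling choice (the architecture itself leaves the result unpredictable).

```python
# Mode encodings from Table 2-2. All names here are illustrative only.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "SVC",
    0b10111: "Abort",
    0b11011: "Undef",
    0b11111: "System",
}


def decode_mode(psr):
    """Return the mode named by M[4:0], or None for an undefined encoding."""
    return MODES.get(psr & 0x1F)


def has_spsr(mode_name):
    """User and System modes have no SPSR; each exception mode has one."""
    return mode_name not in (None, "User", "System")
```

For example, a PSR value of 0xD3 decodes to SVC mode, which has an SPSR, whereas 0x1F decodes to System mode, which does not.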
2.5 Exceptions 


Exceptions are generated by internal and external sources to cause the processor 
to handle an event; for example, an externally generated interrupt, or an attempt to 
execute an undefined instruction. The processor state just before handling the exception 
must be preserved so that the original program can be resumed when the exception 
routine has completed. More than one exception may arise at the same time. 


ARM supports 7 types of exception and has a privileged processor mode for each type 
of exception. Table 2-3: Exception processing modes lists the types of exception and 
the processor mode that is used to process that exception. When an exception occurs 
execution is forced from a fixed memory address corresponding to the type of exception. 
These fixed addresses are called the Hard Vectors. 


The reserved entry at address 0x14 is for an Address Exception vector used when the 
processor is configured for a 26-bit address space. See Chapter 5, The 26-bit 
Architectures for more information. 


Exception type                                    Mode    Vector address 

Reset                                             SVC     0x00000000 
Undefined instructions                            UNDEF   0x00000004 
Software Interrupt (SWI)                          SVC     0x00000008 
Prefetch Abort (Instruction fetch memory abort)   ABORT   0x0000000c 
Data Abort (Data Access memory abort)             ABORT   0x00000010 
IRQ (Interrupt)                                   IRQ     0x00000018 
FIQ (Fast Interrupt)                              FIQ     0x0000001c 

Table 2-3: Exception processing modes 


When taking an exception, the banked registers are used to save state. When an 
exception occurs, these actions are performed: 


R14_<exception_mode> = PC 
SPSR_<exception_mode> = CPSR 


CPSR[5:0] = Exception mode number 
CPSR[6] = if <exception_mode> == Reset or FIQ then = 1 else unchanged 
CPSR[7] = 1; Interrupt disabled 


PC = Exception vector address 


To return after handling the exception, the SPSR is moved into the CPSR and R14 is 
moved to the PC. This can be done atomically in two ways: 


1 Using a data-processing instruction with the S bit set, and the PC as the 
destination. 


2 Using the Load Multiple and Restore PSR instruction. 


The following sections show the recommended way of returning from each exception. 
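

The entry actions listed above, and the matching return (SPSR copied back to the CPSR, R14 copied to the PC), can be modelled with a short Python sketch. This is an illustration under stated assumptions, not a description of any implementation: the Cpu class, the dictionary keys and the function names are hypothetical, the banked registers are modelled as dictionaries keyed by mode suffix, and the caller supplies the link value because it differs per exception type.

```python
from dataclasses import dataclass, field

I_BIT = 1 << 7  # CPSR[7], IRQ disable
F_BIT = 1 << 6  # CPSR[6], FIQ disable

# (mode suffix, mode number, vector address), from Tables 2-2 and 2-3.
EXCEPTIONS = {
    "reset": ("svc", 0b10011, 0x00),
    "undef": ("und", 0b11011, 0x04),
    "swi": ("svc", 0b10011, 0x08),
    "prefetch_abort": ("abt", 0b10111, 0x0C),
    "data_abort": ("abt", 0b10111, 0x10),
    "irq": ("irq", 0b10010, 0x18),
    "fiq": ("fiq", 0b10001, 0x1C),
}


@dataclass
class Cpu:
    """Hypothetical minimal processor state for this sketch."""
    pc: int = 0
    cpsr: int = 0b10000                        # User mode, interrupts enabled
    r14: dict = field(default_factory=dict)    # banked link registers
    spsr: dict = field(default_factory=dict)   # banked SPSRs


def take_exception(cpu, kind, return_link):
    mode, number, vector = EXCEPTIONS[kind]
    cpu.r14[mode] = return_link              # R14_<mode> = link value
    cpu.spsr[mode] = cpu.cpsr                # SPSR_<mode> = CPSR
    cpsr = (cpu.cpsr & ~0x3F) | number       # CPSR[5:0] = mode number
    if kind in ("reset", "fiq"):
        cpsr |= F_BIT                        # CPSR[6] = 1 only for Reset and FIQ
    cpsr |= I_BIT                            # CPSR[7] = 1
    cpu.cpsr = cpsr & 0xFFFFFFFF
    cpu.pc = vector


def return_from_exception(cpu, mode, adjust=0):
    """Models MOVS PC,R14 (adjust=0) or SUBS PC,R14,#adjust."""
    cpu.pc = (cpu.r14[mode] - adjust) & 0xFFFFFFFF
    cpu.cpsr = cpu.spsr[mode]
```

Taking an IRQ from User mode with this model leaves the PC at 0x18 and the CPSR mode bits at 0b10010 with the I bit set; calling return_from_exception(cpu, "irq", 4) then restores the saved CPSR and resumes at the link value minus 4.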
2.5.1 Reset 


When the processor’s Reset input is asserted, ARM immediately stops execution of the 
current instruction. When the Reset is de-asserted, the following actions are performed: 


R14_svc = unpredictable value 
SPSR_svc = CPSR 


CPSR[5:0] = 0b010011 ; Supervisor mode 

CPSR[6] = 1 ; Fast Interrupts disabled 
CPSR[7] = 1 ; Interrupts disabled 

PC = 0x0 


Therefore, after reset, ARM begins execution at address 0x0 in supervisor mode with 
interrupts disabled. See 7.6 Memory Management Unit (MMU) Architecture on 
page 7-14 for more information on the effects of Reset. 


2.5.2 Undefined instruction exception 


If ARM executes a coprocessor instruction, it waits for any external coprocessor 
to acknowledge that it can execute the instruction. If no coprocessor responds, 
an undefined instruction exception occurs. If an attempt is made to execute 
an instruction that is undefined, an undefined instruction exception occurs (see 3.14.5 
Undefined instruction Space on page 3-27). 


The undefined instruction exception may be used for software emulation of 
a coprocessor in a system that does not have the physical coprocessor (hardware), 
or for general-purpose instruction set extension by software emulation. 


When an undefined instruction exception occurs, the following actions are performed: 


R14_und = address of undefined instruction + 4 
SPSR_und = CPSR 


CPSR[5:0] = 0b011011 ; Undefined mode 

CPSR[6] = unchanged ; Fast Interrupt status is unchanged 
CPSR[7] = 1 ; (Normal) Interrupts disabled 

PC = 0x4 


To return after emulating the undefined instruction, use: 


MOVS PC,R14 


This restores the PC (from R14_und) and CPSR (from SPSR_und) and returns to 
the instruction following the undefined instruction. 


2.5.3 Software interrupt exception 


The software interrupt instruction (SWI) enters Supervisor mode to request a particular 
supervisor (Operating System) function. When a SWI is executed, the following are 
performed: 
R14_svc = address of SWI instruction + 4 
SPSR_svc = CPSR 


CPSR[5:0] = 0b010011 ; Supervisor mode 

CPSR[6] = unchanged ; Fast Interrupt status is unchanged 
CPSR[7] = 1 ; (Normal) Interrupts disabled 

PC = 0x8 


To return after performing the SWI operation, use: 


MOVS PC,R14 


This restores the PC (from R14_svc) and CPSR (from SPSR_svc) and returns to 
the instruction following the SWI. 


2.5.4 Prefetch Abort (Instruction Fetch Memory Abort) 


A memory abort is signalled by the memory system. Activating an abort in response to 
an instruction fetch marks the fetched instruction as invalid. An abort will take place if 
the processor attempts to execute the invalid instruction. If the instruction is not 
executed (for example as a result of a branch being taken while it is in the pipeline), 
no prefetch abort will occur. 


When an attempt is made to execute an aborted instruction, the following actions are 
performed: 


R14_abt = address of the aborted instruction + 4 
SPSR_abt = CPSR 


CPSR[5:0] = 0b010111 ; Abort mode 

CPSR[6] = unchanged ; Fast Interrupt status is unchanged 
CPSR[7] = 1 ; (Normal) Interrupts disabled 

PC = 0xc 


To return after fixing the reason for the abort, use: 


SUBS PC,R14, #4 


This restores both the PC (from R14_abt) and CPSR (from SPSR_abt) and returns to 
the aborted instruction. 


2.5.5 Data Abort (Data Access Memory Abort) 




A memory abort is signalled by the memory system. Activating an abort in response to 
a data access (Load or Store) marks the data as invalid. A data abort exception will 
occur before any following instructions or exceptions have altered the state of the CPU, 
and the following actions are performed: 


R14_abt = address of the aborted instruction + 8 
SPSR_abt = CPSR 


CPSR[5:0] = 0b010111 ; Abort mode 

CPSR[6] = unchanged ; Fast Interrupt status is unchanged 
CPSR[7] = 1 ; (Normal) Interrupts disabled 

PC = 0x10 
To return after fixing the reason for the abort, use: 


SUBS PC,R14,#8 


This restores both the PC (from R14_abt) and CPSR (from SPSR_abt) and returns to 
re-execute the aborted instruction. 


If the aborted instruction does not need to be re-executed use: 


SUBS PC,R14,#4 


The final value left in the base register used in memory access instructions which 
specify writeback and generate a data abort (LDR, LDRH, LDRSH, LDRB, LDRSB, 
STR, STRH, STRB, LDM, STM, LDC, STC) is IMPLEMENTATION DEFINED. 


An implementation can choose to leave either the original value or the updated value in 
the base register, but the same behaviour must be implemented for all memory access 
instructions. 


2.5.6 IRQ (Interrupt Request) exception 


The IRQ (Interrupt ReQuest) exception is externally generated by asserting the 
processor's IRQ input. It has a lower priority than FIQ (see below), and is masked out 
when a FIQ sequence is entered. Interrupts are disabled when the I bit in the CPSR is 
set (but note that the I bit can only be altered from a privileged mode). If the I flag is clear, 
ARM checks for an IRQ at instruction boundaries. 


When an IRQ is detected, the following actions are performed: 


R14_irq = address of next instruction to be executed + 4 
SPSR_irq = CPSR 


CPSR[5:0] = 0b010010 ; Interrupt mode 

CPSR[6] = unchanged ; Fast Interrupt status is unchanged 
CPSR[7] = 1 ; (Normal) Interrupts disabled 

PC = 0x18 


To return after servicing the interrupt, use: 


SUBS PC,R14, #4 


This restores both the PC (from R14_irq) and CPSR (from SPSR_irq) and resumes 
execution of the interrupted code. 


2.5.7 FIQ (Fast Interrupt Request) exception 


The FIQ (Fast Interrupt reQuest) exception is externally generated by asserting the 
processor's FIQ input. FIQ is designed to support a data transfer or channel process, 
and has sufficient private registers to remove the need for register saving in such 
applications (thus minimising the overhead of context switching). 


Fast interrupts are disabled when the F bit in the CPSR is set (but note that the F bit can 
only be altered from a privileged mode). If the F flag is clear, ARM checks for a FIQ at 
instruction boundaries. 
When a FIQ is detected, the following actions are performed: 


R14_fiq = address of next instruction to be executed + 4 
SPSR_fiq = CPSR 


CPSR[5:0] = 0b010001 ; FIQ mode 

CPSR[6] = 1 ; Fast Interrupts disabled 
CPSR[7] = 1 ; Interrupts disabled 

PC = 0x1c 


To return after servicing the interrupt, use: 


SUBS PC, R14,#4 


This restores both the PC (from R14_fiq) and CPSR (from SPSR_fiq) and resumes 
execution of the interrupted code. 


The FIQ vector is deliberately the last vector to allow the FIQ exception-handler software 
to be placed directly at address 0x1c, and not require a branch instruction from 
the vector. 


2.5.8 Exception priorities 


The Reset exception has the highest priority. FIQ has higher priority than IRQ. IRQ has 
higher priority than prefetch abort. 


Undefined instruction and software interrupt cannot occur at the same time, as they 
each correspond to particular (non-overlapping) decodings of the current instruction, 
and both must be lower priority than prefetch abort, as a prefetch abort indicates that no 
valid instruction was fetched. 


The priority of data abort is higher than FIQ and lower priority than Reset, which ensures 
that the data-abort handler is entered before the FIQ handler is entered (so that the data 
abort will be resolved after the FIQ handler has completed). 





Exception                    Priority 

Reset                        1 (Highest) 
Data Abort                   2 
FIQ                          3 
IRQ                          4 
Prefetch Abort               5 
Undefined Instruction, SWI   6 (Lowest) 


Table 2-4: Exception priorities 
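

As an illustration of how the ordering above resolves simultaneous exceptions, the following Python sketch picks the exception to be handled first from a collection of pending ones. The priority numbers follow the ordering described in this section; the names PRIORITY and highest_priority and the string keys are hypothetical and exist only for this example.

```python
# Priority numbers following the ordering in this section (1 is highest).
# Undefined instruction and SWI share the lowest level because they
# cannot occur at the same time.
PRIORITY = {
    "reset": 1,
    "data_abort": 2,
    "fiq": 3,
    "irq": 4,
    "prefetch_abort": 5,
    "undef": 6,
    "swi": 6,
}


def highest_priority(pending):
    """Return the pending exception that is taken first, or None if none."""
    pending = list(pending)
    if not pending:
        return None
    return min(pending, key=lambda kind: PRIORITY[kind])
```

With a data abort and a FIQ pending together, the sketch selects the data abort, which matches the rule that the data-abort handler is entered before the FIQ handler.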
This chapter describes the ARM instruction set. 


Load and Store Word or Unsigned Byte Instructions 
Load and Store Halfword and Load Signed Byte Instructions 
Load and Store Multiple Instructions 
Semaphore Instructions 
Coprocessor Instructions 
Extending the Instruction Set 
Alphabetical List of ARM Instructions 
Data-processing Operands 
Load and Store Word or Unsigned Byte Addressing Modes 
Load and Store Halfword or Load Signed Byte Addressing Modes 
Load and Store Multiple Addressing Modes 
Load and Store Multiple Addressing Modes (Alternative names) 
Load and Store Coprocessor Addressing Modes 


3.1 Using this Chapter 


This chapter is divided into three parts: 

1 Overview of the ARM instruction types 
2 Alphabetical list of instructions 

3 Addressing modes 


Overview of the ARM instruction types (page 3-3 through 3-27) 


This part describes the functional groups within the instruction set, and shows relevant 
examples and encodings. Each functional group lists all its instructions, which you can 
then find in the alphabetical section. The functional groups are: 


1 Branch 

2 Data processing 

3 Multiply 

4 Status register access 
5 Load and store: 


- load and store word or unsigned byte 
- load and store halfword and load signed byte 
- load and store multiple 


6 Semaphore 
7 Coprocessor 


Alphabetical list of instructions (page 3-30 through 3-81) 


This part lists every ARM instruction, and gives: 


instruction syntax and functional group 

encoding and operation 

relevant exceptions and qualifiers 

notes on usage 

restrictions on availability in versions of the ARM architecture 
a cross-reference to the relevant addressing modes 


Addressing modes (page 3-84 through 3-126) 


This part lists the addressing modes for the functional groups of instructions: 


Mode 1 Shifter operands for data processing instructions 

Mode 2 Load and store word or unsigned byte addressing modes 
Mode 3 Load and store halfword or load signed byte addressing modes 
Mode 4 Load and store multiple addressing modes 

Mode 5 Load and store coprocessor addressing modes 


3.2 Instruction Set Overview 


Table 3-1: ARM instruction set overview (expanded) shows the instruction set 
encoding. All other bit patterns are UNPREDICTABLE. 


Data processing immediate 
Data processing immediate shift 
Data processing register shift 
Multiply 
Multiply long 
Move from Status register 
Move immediate to Status register 
Move register to Status register 
Branch/Exchange instruction set 
Load/Store immediate offset 
Load/Store register offset 
Load/Store halfword/signed byte 
Load/Store halfword/signed byte 
Swap/Swap byte 
Load/Store multiple 
Coprocessor data processing 
Coprocessor register transfers 
Coprocessor load and store 
Branch and Branch with link 
Software interrupt 
Undefined instruction 


Table 3-1: ARM instruction set overview (expanded) 
3.3 The Condition Field 


All ARM instructions can be conditionally executed, which means that their execution 
may or may not take place depending on the values of the N, Z, C and V flags 
in the CPSR. Every instruction contains a 4-bit condition code field in bits 31 to 28, 
as shown in Figure 3-1: Condition code fields. 


31    28 27                                          0 
  cond 


Figure 3-1: Condition code fields 


3.3.1 Condition codes 


This field specifies one of 16 conditions as described in Table 3-2: Condition codes on 
page 3-5. Every instruction mnemonic may be extended with the letters defined in 
the mnemonic extension field. 

If the always (AL) condition is specified, the instruction will be executed irrespective of 
the value of the condition code flags. Any instruction that uses the never (NV) condition 
is UNPREDICTABLE. The absence of a condition code on an instruction mnemonic implies 
the always (AL) condition code. 


Opcode    Mnemonic 
[31:28]   Extension   Meaning                               Status flag state 

0000      EQ          Equal                                 Z set 
0001      NE          Not Equal                             Z clear 
0010      CS/HS       Carry Set / Unsigned Higher or Same   C set 
0011      CC/LO       Carry Clear / Unsigned Lower          C clear 
0100      MI          Minus / Negative                      N set 
0101      PL          Plus / Positive or Zero               N clear 
0110      VS          Overflow                              V set 
0111      VC          No Overflow                           V clear 
1000      HI          Unsigned Higher                       C set and Z clear 
1001      LS          Unsigned Lower or Same                C clear or Z set 
1010      GE          Signed Greater Than or Equal          N set and V set, or 
                                                            N clear and V clear (N = V) 
1011      LT          Signed Less Than                      N set and V clear, or 
                                                            N clear and V set (N != V) 
1100      GT          Signed Greater Than                   Z clear, and either N set and V set, or 
                                                            N clear and V clear (Z = 0, N = V) 
1101      LE          Signed Less Than or Equal             Z set, or N set and V clear, or 
                                                            N clear and V set (Z = 1 or N != V) 
1110      AL          Always (unconditional)                - 
1111      NV          Never                                 - 


Table 3-2: Condition codes 
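

The condition tests in the table above can be written out as a small Python sketch. It assumes the usual PSR layout with N, Z, C and V in bits 31, 30, 29 and 28 respectively; the function name condition_passed is hypothetical, and raising an error for the NV encoding is only one way of modelling UNPREDICTABLE behaviour.

```python
def condition_passed(cond, cpsr):
    """Evaluate a 4-bit condition field against the flags held in cpsr."""
    n = (cpsr >> 31) & 1
    z = (cpsr >> 30) & 1
    c = (cpsr >> 29) & 1
    v = (cpsr >> 28) & 1
    if cond == 0b1111:
        # NV: the manual defines any use of this condition as UNPREDICTABLE.
        raise ValueError("NV condition is UNPREDICTABLE")
    table = {
        0b0000: z == 1,              # EQ
        0b0001: z == 0,              # NE
        0b0010: c == 1,              # CS/HS
        0b0011: c == 0,              # CC/LO
        0b0100: n == 1,              # MI
        0b0101: n == 0,              # PL
        0b0110: v == 1,              # VS
        0b0111: v == 0,              # VC
        0b1000: c == 1 and z == 0,   # HI
        0b1001: c == 0 or z == 1,    # LS
        0b1010: n == v,              # GE
        0b1011: n != v,              # LT
        0b1100: z == 0 and n == v,   # GT
        0b1101: z == 1 or n != v,    # LE
        0b1110: True,                # AL
    }
    return table[cond]
```

For instance, with only the Z flag set, EQ and LE pass while NE and GT fail.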


3.4 Branch Instructions 


All ARM processors support a branch instruction that allows a conditional branch 
forwards or backwards up to 32 Mbytes. As the Program Counter (PC) is one of 
the general-purpose registers (register 15), a branch or jump can also be generated by 
writing a value to register 15. 


A subroutine call is a variant of the standard branch; as well as allowing a branch 
forward or backward up to 32 Mbytes, the Branch with Link instruction preserves 
the address of the instruction after the branch (the return address) in register 14 
(the link register or LR). 


Lastly, a load instruction provides a way to branch anywhere in the 4 Gbyte address 
space (known as a long branch). A 32-bit value is loaded directly from memory into 
the PC, causing a branch. The load instruction may be preceded by a Move 
instruction to store a return address in the link register (LR or R14). 


3.4.1 Examples 


        B    label      ; branch unconditionally to label 
        BCC  label      ; branch to label if carry flag is clear 
        BEQ  label      ; branch to label if zero flag is set 
        MOV  PC, #0     ; R15 = 0, branch to location zero 
        BL   func       ; subroutine call to function 
func 
        MOV  PC, LR     ; R15=R14, return to instruction after the BL 
        MOV  LR, PC     ; store the address of the instruction after 
                        ; the next one into R14 ready to return 
        LDR  PC, =func  ; load a 32 bit value into the program counter 


Processors that support the Thumb instruction set (Architecture v4T) also support 
a branch instruction (BX) that jumps to a given address, and optionally switches 
to executing Thumb instructions. 
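

The 32 Mbyte branch range can be illustrated with a Python sketch of the offset calculation. It assumes the standard ARM encoding, which this section does not spell out: B and BL hold a signed 24-bit word offset relative to the address of the branch plus 8 (the value the PC reads as). The function names are hypothetical.

```python
def branch_offset_field(instr_addr, target):
    """Return the 24-bit offset field for a B/BL at instr_addr to target."""
    delta = target - (instr_addr + 8)      # PC reads as instruction + 8
    if delta % 4:
        raise ValueError("target must be word aligned")
    words = delta // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target is outside the 32 Mbyte branch range")
    return words & 0xFFFFFF


def branch_target(instr_addr, offset_field):
    """Inverse of branch_offset_field: recover the branch destination."""
    words = offset_field
    if words & (1 << 23):                  # sign-extend the 24-bit field
        words -= 1 << 24
    return (instr_addr + 8 + (words << 2)) & 0xFFFFFFFF
```

A branch to itself therefore encodes an offset field of 0xFFFFFE (minus two words), and a target more than about 32 Mbytes away is rejected, which is the case where a load into the PC is needed instead.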


3.4.2 List of branch instructions 


B, BL Branch, and branch with link page 3-33 
BX Branch and exchange instruction set page 3-35 


3.5 Data Processing 


ARM has 16 data-processing instructions. Most data-processing instructions take two 
source operands (Move and Move Not have only one operand) and store a result in 
a register (except for the Compare and Test instructions which only update 
the condition codes). Of the two source operands, one is always a register, the other 
is called a shifter operand, and is either an immediate value or a register. If the second 
operand is a register value, it may have a shift applied to it before it is used as 
the operand to the ALU. 








Mnemonic   Operation                     Opcode   Action 

MOV        Move                          1101     Rd := shifter_operand (no first operand) 
MVN        Move Not                      1111     Rd := NOT shifter_operand (no first operand) 
ADD        Add                           0100     Rd := Rn + shifter_operand 
ADC        Add with Carry                0101     Rd := Rn + shifter_operand + Carry Flag 
SUB        Subtract                      0010     Rd := Rn - shifter_operand 
SBC        Subtract with Carry           0110     Rd := Rn - shifter_operand - NOT(Carry Flag) 
RSB        Reverse Subtract              0011     Rd := shifter_operand - Rn 
RSC        Reverse Subtract with Carry   0111     Rd := shifter_operand - Rn - NOT(Carry Flag) 
AND        Logical AND                   0000     Rd := Rn AND shifter_operand 
EOR        Logical Exclusive OR          0001     Rd := Rn EOR shifter_operand 
ORR        Logical (inclusive) OR        1100     Rd := Rn OR shifter_operand 
BIC        Bit Clear                     1110     Rd := Rn AND NOT shifter_operand 
CMP        Compare                       1010     update flags after Rn - shifter_operand 
CMN        Compare Negated               1011     update flags after Rn + shifter_operand 
TST        Test                          1000     update flags after Rn AND shifter_operand 
TEQ        Test Equivalence              1001     update flags after Rn EOR shifter_operand 


Table 3-3: Data-processing instructions 


3.5.1 Instruction encoding 


<opcode1>{<cond>}{S} Rd, <shifter_operand> 
<opcode1> := MOV | MVN 


<opcode2>{<cond>} Rn, <shifter_operand> 
<opcode2> := CMP | CMN | TST | TEQ 


<opcode3>{<cond>}{S} Rd, Rn, <shifter_operand> 
<opcode3> := ADD | SUB | RSB | ADC | SBC | RSC | AND | BIC | EOR | ORR 


31   28 27 26 25 24    21 20 19  16 15  12 11                 0 
 cond    0  0  I  opcode   S   Rn     Rd     shifter_operand 


Notes 


Rd                  specifies the destination register 
Rn                  specifies the first source operand register 
<shifter_operand>   specifies the second source operand. 
                    See 3.16 Data-processing Operands on 
                    page 3-84 for details of the shifter operands. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 


The S bit: Bit 20 is used to signify that the instruction updates the condition codes. 
3.5.2 Condition code flags 


Data-processing instructions can update the four condition code flags. 
CMP, CMN, TST and TEQ always update the condition code flags; the remaining 
instructions will update the flags if an S is appended to the instruction mnemonic 
(which sets the S bit in the instruction). 


Bits are set as follows: 


N (Negative) flag   is set if the result of a data-processing instruction is 
                    negative 

Z (Zero) flag       is set if the result is equal to zero 

C (Carry) flag      is set if an add, subtract or compare causes a carry 
                    (result bigger than 32 bits), or is set from the output of 
                    the shifter for move and logical instructions 

V (Overflow) flag   is set if an add, subtract or compare overflows 
                    (signed result bigger than 31 bits); unaffected by move or 
                    logical instructions 
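

The flag rules above can be illustrated for the arithmetic case with a short Python sketch. It assumes the usual ARM convention that a subtraction is performed as an addition of the inverted operand plus one, so that C set after a subtract means no borrow occurred; this matches the NOT(Carry Flag) terms in Table 3-3. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b, carry_in=0):
    """Return (result, (N, Z, C, V)) for a 32-bit add with optional carry-in."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in
    result = full & MASK32
    n = result >> 31                        # sign bit of the result
    z = int(result == 0)
    c = int(full > MASK32)                  # carry out of bit 31
    # Overflow: both operands have the same sign and the result's differs.
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, (n, z, c, v)


def sub_with_flags(a, b):
    """a - b computed as a + NOT(b) + 1; C set means no borrow occurred."""
    return add_with_flags(a, ~b & MASK32, 1)
```

Adding 1 to 0x7FFFFFFF, for example, gives 0x80000000 with N and V set, while subtracting 5 from 3 gives 0xFFFFFFFE with N set and C clear (a borrow occurred).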


3.5.3 List of data-processing instructions 


ADC   Add with Carry                page 3-30 
ADD   Add                           page 3-31 
AND   Logical AND                   page 3-32 
BIC   Logical Bit Clear             page 3-34 
CMN   Compare Negative              page 3-37 
CMP   Compare                       page 3-38 
EOR   Logical EOR                   page 3-39 
MOV   Move                          page 3-53 
MVN   Move negative                 page 3-59 
ORR   Logical OR                    page 3-60 
RSB   Reverse Subtract              page 3-61 
RSC   Reverse Subtract with Carry   page 3-62 
SBC   Subtract with Carry           page 3-63 
SUB   Subtract                      page 3-74 
TEQ   Test Equivalence              page 3-78 
TST   Test                          page 3-79 


3.6 Multiply Instructions 


ARM has two classes of multiply instruction: 

* normal, 32-bit result 
* long, 64-bit result 


All multiply instructions take two register operands as the input to the multiplier. 
ARM does not directly support a multiply-by-constant instruction due to the efficiency 
of shift and add, or shift and reverse subtract instructions. 


3.6.1 Normal multiply 


There are two multiply instructions that produce 32-bit results: 


MUL   multiplies the values of two registers together, truncates the result to 
      32 bits, and stores the result in a third register. 

MLA   multiplies the values of two registers together, adds the value of a third 
      register, truncates the result to 32 bits, and stores the result into a fourth 
      register (i.e. performs multiply and accumulate). 


Both multiply instructions can optionally set the N (Negative) and Z (Zero) condition 
code flags. No distinction is made between signed and unsigned variants; only 
the least-significant 32 bits of the result are stored in the destination register, and 
the sign of the operands does not affect this value. 


MUL   R4, R2, R1       ; Set R4 to value of R2 multiplied by R1 
MULS  R4, R2, R1       ; R4 = R2 x R1, set N and Z flags 
MLA   R7, R8, R9, R3   ; R7 = R8 x R9 + R3 


3.6.2 Long multiply 


There are four multiply instructions that produce 64-bit results (long multiply). 


Two of the variants multiply the values of two registers together and store the 64-bit 
result in a third and fourth register. There are signed (SMULL) and unsigned (UMULL) 
variants. (The signed variants produce a different result in the most significant 32 bits 
if either or both of the source operands is negative.) 

The remaining two variants multiply the values of two registers together, add the 64-bit 
value from a third and fourth register and store the 64-bit result back into those 
registers (third and fourth). There are also signed (SMLAL) and unsigned (UMLAL) 
variants. These instructions perform a long multiply and accumulate. 

All four long multiply instructions can optionally set the N (Negative) and Z (Zero) 
condition code flags: 


SMULL  R4, R8, R2, R3   ; R4 = bits 0 to 31 of R2 x R3 
                        ; R8 = bits 32 to 63 of R2 x R3 
UMULL  R6, R8, R0, R1   ; R6, R8 = R0 x R1 
UMLAL  R5, R8, R0, R1   ; R5, R8 = R0 x R1 + R5, R8 
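

The difference between the normal and long forms, and between the signed and unsigned long forms, can be checked with a small Python model. This is only a sketch of the arithmetic described above: the function names mirror the mnemonics but are otherwise hypothetical, register values are passed as plain integers, and the long forms return a (low word, high word) pair.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit register value as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rm, rs):
    return (rm * rs) & MASK32               # result truncated to 32 bits


def mla(rm, rs, rn):
    return (rm * rs + rn) & MASK32          # multiply and accumulate


def umull(rm, rs):
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32  # (low word, high word)


def smull(rm, rs):
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32


def umlal(lo, hi, rm, rs):
    acc = (((hi << 32) | lo) + (rm & MASK32) * (rs & MASK32)) & MASK64
    return acc & MASK32, acc >> 32
```

Multiplying 0xFFFFFFFF by 2 gives the same low word (0xFFFFFFFE) in every form, but the high word is 1 for umull and 0xFFFFFFFF for smull, which is the difference in the most significant 32 bits noted above.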


3.6.3 List of multiply instructions 


MLA     Multiply accumulate                 page 3-52 
MUL     Multiply                            page 3-58 
SMLAL   Signed multiply accumulate long     page 3-64 
SMULL   Signed multiply long                page 3-65 
UMLAL   Unsigned multiply accumulate long   page 3-80 
UMULL   Unsigned multiply long              page 3-81 


3.7 Status Register Access 


There are two instructions for moving the contents of a program status register to or 
from a general-purpose register. Both the CPSR and SPSR can be accessed. 
Each status register is split into four 8-bit fields that can be individually written: 


bits 31 to 24   the flags field 
bits 23 to 16   the status field 
bits 15 to 8    the extension field 
bits 7 to 0     the control field 


ARMv4 does not use the status and extension fields, and 4 bits are unused in the flags 
field. The four condition code flags occupy the remaining four bits of the flags field, and 
the control field contains two interrupt disable bits, 5 processor mode bits, and the 
Thumb bit on ARMv4T (see 2.4 Program Status Registers on page 2-3). 

The unused bits of the status registers may be used in future ARM architectures, and 
should not be modified by software. Therefore, a read-modify-write strategy should be 
used to update the value of a status register to ensure future compatibility. The status 
registers are readable to allow the read part of the read-modify-write operation, and 
to allow all processor state to be preserved (for instance, during process context 
switches). The status registers are writeable to allow the write part of the 
read-modify-write operation, and to allow all processor state to be restored. 


3.7.1 CPSR value 


Altering the value of the CPSR has three uses: 

1 Sets the value of the condition code flags to a known value. 
2 Enables or disables interrupts. 
3 Changes processor mode (for instance to initialise stack pointers). 


3.7.2 Examples 


MRS  R0, CPSR              ; Read the CPSR 
BIC  R0, R0, #0xf0000000   ; Clear the N, Z, C and V bits 
MSR  CPSR_f, R0            ; Update the flag bits in the CPSR 
                           ; N, Z, C and V flags now all clear 

MRS  R0, CPSR              ; Read the CPSR 
ORR  R0, R0, #0x80         ; Set the interrupt disable bit 
MSR  CPSR_c, R0            ; Update the control bits in the CPSR 
                           ; interrupts (IRQ) now disabled 

MRS  R0, CPSR              ; Read the CPSR 
BIC  R0, R0, #0x1f         ; Clear the mode bits 
ORR  R0, R0, #0x11         ; Set the mode bits to FIQ mode 
MSR  CPSR_c, R0            ; Update the control bits in the CPSR 
                           ; now in FIQ mode 


3.7.3 List of status register access instructions 


MRS   Move SR to general-purpose register   page 3-55 
MSR   Move general-purpose register to SR 
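

The read-modify-write pattern used in the examples above can be modelled in Python as follows. The sketch treats the CPSR as a 32-bit integer and models MSR as a write restricted to the named 8-bit fields. The field letters f and c follow the CPSR_f and CPSR_c forms used in the examples; the letters s and x for the status and extension fields, and all function names, are assumptions made for this illustration.

```python
# One mask per 8-bit field of a status register.
FIELD_MASKS = {
    "f": 0xFF000000,   # flags field, bits 31 to 24
    "s": 0x00FF0000,   # status field, bits 23 to 16 (letter assumed)
    "x": 0x0000FF00,   # extension field, bits 15 to 8 (letter assumed)
    "c": 0x000000FF,   # control field, bits 7 to 0
}


def msr(psr, value, fields):
    """Write value into psr, changing only the fields named in `fields`."""
    mask = 0
    for name in fields:
        mask |= FIELD_MASKS[name]
    return (psr & ~mask & 0xFFFFFFFF) | (value & mask)


def disable_irq(cpsr):
    r0 = cpsr                 # MRS R0, CPSR
    r0 |= 0x80                # ORR R0, R0, #0x80
    return msr(cpsr, r0, "c")  # MSR CPSR_c, R0


def switch_mode(cpsr, mode_bits):
    r0 = cpsr                 # MRS R0, CPSR
    r0 &= ~0x1F               # BIC R0, R0, #0x1f
    r0 |= mode_bits           # ORR R0, R0, #<mode>
    return msr(cpsr, r0, "c")  # MSR CPSR_c, R0
```

Because only the selected field is written, the flags in bits 31 to 28 survive a mode change: switch_mode(0x60000090, 0x11) yields 0x60000091.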


3.8 Load and Store Instructions 


ARMv4 supports two broad classes of instruction which load or store the value of 
a single register from or to memory. 


* The first form can load or store a 32-bit word or an 8-bit unsigned byte 

* The second form can load or store a 16-bit unsigned halfword, and can load 
  and sign extend a 16-bit halfword or an 8-bit byte 


The first form (word and unsigned byte) allows a wider range of addressing modes 
than the second (halfword and signed byte). The Word and Unsigned Byte addressing 
mode comes in two parts: 


* the base register 
* the offset 


The base register is any one of the general-purpose registers (including the PC, 
which allows PC-relative addressing for position-independent code). 


The offset takes one of three forms: 


1 Immediate Offset: 
The offset is a 12-bit unsigned number, that may be added to or subtracted from 
the base register. Immediate Offset addressing is useful for accessing data 
elements that are a fixed distance from the start of the data object, such as 
structure fields, stack offsets and IO registers. 


2 Register Offset: 
The offset is a general-purpose register (not the PC), that may be added to or 
subtracted from the base register. Register offsets are useful for accessing arrays 
or blocks of data. 


3 Scaled Register Offset: 
The offset is a general-purpose register (not the PC) shifted by an immediate 
value, then added to or subtracted from the base register. The same shift 
operations used for data-processing instructions can be used (Logical Shift Left, 
Logical Shift Right, Arithmetic Shift Right and Rotate Right), but Logical Shift Left 
is the most useful as it allows an array index to be scaled by the size of each array 
element. 


As well as the three forms of offset, the offset and base register are used in three 
different ways to form the memory address. 


1 Offset addressing: 
The base register and offset are simply added or subtracted to form the memory 
address. 


2 Pre-indexed addressing: 
The base register and offset are added or subtracted to form the memory address. 
The base register is then updated with this new address, to allow automatic 
indexing through an array or memory block. 

3 Post-indexed addressing: 
The value of the base register alone is used as the memory address. The base 
register and offset are added or subtracted and this value is stored back in 
the base register, to allow automatic indexing through an array or memory block. 
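

The three ways of combining the base register and offset can be summarised in a Python sketch. The function returns both the address used for the memory access and the value left in the base register afterwards; the names effective_address and scaled_offset and the mode strings are hypothetical, chosen only for this illustration.

```python
MASK32 = 0xFFFFFFFF


def scaled_offset(rm, shift):
    """Scaled register offset using Logical Shift Left, e.g. LSL #2 for words."""
    return (rm << shift) & MASK32


def effective_address(base, offset, mode, subtract=False):
    """Return (access address, base register value after the access)."""
    delta = -offset if subtract else offset
    updated = (base + delta) & MASK32
    if mode == "offset":    # [Rn, offset]   base register unchanged
        return updated, base
    if mode == "pre":       # [Rn, offset]!  base updated, new value used
        return updated, updated
    if mode == "post":      # [Rn], offset   old value used, base updated
        return base, updated
    raise ValueError("unknown addressing mode: " + mode)
```

For a base of 0x1000 and an offset of 4, offset addressing accesses 0x1004 and leaves the base alone, pre-indexed addressing accesses 0x1004 and writes 0x1004 back, and post-indexed addressing accesses 0x1000 and writes 0x1004 back.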


3.8.1 Examples 


LDR   R1, [R0]              ; Load register 1 from the address in register 0 
LDR   R8, [R3, #4]          ; Load R8 from the address in R3 + 4 
LDR   R12, [R13, #-4]       ; Load R12 from R13 - 4 
STR   R2, [R1, #0x100]      ; Store R2 to the address in R1 + 0x100 

LDRB  R5, [R9]              ; Load a byte into R5 from R9 (zero top 3 bytes) 
LDRB  R3, [R8, #3]          ; Load byte to R3 from R8 + 3 (zero top 3 bytes) 
STRB  R4, [R10, #0x200]     ; Store byte from R4 to R10 + 0x200 

LDR   R11, [R1, R2]         ; Load R11 from the address in R1 + R2 
STRB  R10, [R7, -R4]        ; Store byte from R10 to the address in R7 - R4 
LDR   R11, [R3, R5, LSL #2] ; Load R11 from R3 + (R5 x 4) 

LDR   R1, [R0, #4]!         ; Load R1 from R0 + 4, then R0 = R0 + 4 
STRB  R7, [R6, #-1]!        ; Store byte from R7 to R6 - 1, then R6 = R6 - 1 
LDR   R3, [R9], #4          ; Load R3 from R9, then R9 = R9 + 4 
STR   R2, [R5], #8          ; Store word from R2 to R5, then R5 = R5 + 8 

LDR   R0, [PC, #0x40]       ; Load R0 from PC + 0x40 
                            ; (address of this instruction + 8 + 0x40) 
LDR   R0, [R1], R2          ; Load R0 from R1, then R1 = R1 + R2 























3.8.2 Examples of halfword and signed byte addressing modes 


The Halfword and Signed Byte addressing modes are a subset of the above addressing modes. The scaled
register offset is not supported, and the immediate offset contains 8 bits, not 12.


LDRH  R1, [R0]           ; Load a halfword to R1 from R0 (zero top bytes)
LDRH  R8, [R3, #2]       ; Load a halfword into R8 from R3 + 2
LDRH  R12, [R13, #-6]    ; Load a halfword to R12 from R13 - 6
STRH  R2, [R1, #0x80]    ; Store halfword from R2 to R1 + 0x80
LDRSH R5, [R9]           ; Load signed halfword to R5 from R9
LDRSB R3, [R8, #3]       ; Load signed byte to R3 from R8 + 3
LDRSB R4, [R10, #0xC1]   ; Load signed byte to R4 from R10 + 0xC1
LDRH  R11, [R1, R2]      ; Load halfword to R11 from the address in R1 + R2
STRH  R10, [R7, -R4]     ; Store halfword from R10 to R7 - R4
LDRSH R1, [R0, #2]!      ; Load signed halfword to R1 from R0 + 2, then R0 = R0 + 2
LDRSB R7, [R6, #-1]!     ; Load signed byte to R7 from R6 - 1, then R6 = R6 - 1
LDRH  R3, [R9], #2       ; Load halfword to R3 from R9, then R9 = R9 + 2
STRH  R2, [R5], #8       ; Store halfword from R2 to R5, then R5 = R5 + 8
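
The difference between the unsigned loads (LDRB, LDRH), which zero the top bytes, and the signed loads (LDRSB, LDRSH) can be illustrated with a short Python sketch. The helper names `zero_extend` and `sign_extend` are hypothetical; the sketch models only how the loaded value is widened to 32 bits, not the memory access itself.

```python
MASK32 = 0xFFFFFFFF


def zero_extend(value, bits):
    """Widen an unsigned `bits`-wide value to 32 bits (as LDRB / LDRH do)."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Widen a two's complement `bits`-wide value to 32 bits (as LDRSB / LDRSH do)."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    # Flipping and subtracting the sign bit yields the signed value,
    # which is then wrapped back into an unsigned 32-bit register image.
    return ((value ^ sign_bit) - sign_bit) & MASK32


print(hex(zero_extend(0xC1, 8)))   # 0xc1
print(hex(sign_extend(0xC1, 8)))   # 0xffffffc1
```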
3.9 Load and Store Word or Unsigned Byte Instructions


Load instructions load a single value from memory and write it to a general-purpose 
register. 


Store instructions read a value from a general-purpose register and store it to memory. 
Load and store instructions have a single instruction format:

LDR|STR{<cond>}{B} Rd, <addressing_mode>


31    28 27 26 25 24 23 22 21 20 19   16 15   12 11                      0
  cond    0  1  I  P  U  B  W  L    Rn      Rd    addressing mode specific


The I, P, U and W bits: These bits distinguish between different types of 
<addressing_mode>. 


The L bit: This bit distinguishes between a Load (L==1) and a Store instruction (L==0). 








3.9.1 List of load and store word or unsigned byte instructions 
LDR Load word page 3-44 
LDRB Load byte page 3-45 
LDRBT Load byte with user mode privilege page 3-46 
LDRT Load word with user mode privilege page 3-50 
STR Store word page 3-69 
STRB Store byte page 3-70 
STRBT Store byte with user mode privilege page 3-71 
STRT Store word with user mode privilege page 3-73 
3.10 Load and Store Halfword and Load Signed Byte Instructions


Load instructions load a single value from memory and write it to a general-purpose 
register. 


Store instructions read a value from a general-purpose register and store it to memory. 


Load and store halfword and load signed byte instructions have a single instruction 
format: 


LDR|STR{<cond>}H|SH|SB Rd, <addressing_mode> 


31    28 27 26 25 24 23 22 21 20 19   16 15   12 11      8  7  6  5  4  3       0
  cond    0  0  0  P  U  I  W  L    Rn      Rd    addr_mode  1  S  H  1  addr_mode





The addr_mode bits: These bits are addressing mode specific. 


The I, P, U and W bits: These bits specify the type of <addressing_mode> 
(see section 3.18 Load and Store Halfword or Load Signed Byte 
Addressing Modes on page 3-109). 


The L bit: This bit distinguishes between a Load (L==1) and a Store instruction (L==0). 


The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 


The H bit: This bit distinguishes between a halfword (H==1) and a signed byte (H==0) 
access. 


3.10.1 List of load and store halfword and load signed byte instructions 





LDRH Load unsigned halfword page 3-47 

LDRSB Load signed byte page 3-48 

LDRSH Load signed halfword page 3-49 

STRH Store halfword page 3-72 
3.11 Load and Store Multiple Instructions


Load Multiple instructions load a subset (or possibly all) of the general-purpose 
registers from memory. 


Store Multiple instructions store a subset (or possibly all) of the general-purpose
registers to memory.

Load and Store Multiple instructions have a single instruction format:

LDM{<cond>}<addressing_mode> Rn{!}, <register_list>{^}
STM{<cond>}<addressing_mode> Rn{!}, <register_list>{^}

where: 
<addressing_mode> = IA | IB | DA | DB | FD | FA | ED | EA 














31    28 27 26 25 24 23 22 21 20 19   16 15                    0
  cond    1  0  0  P  U  S  W  L    Rn       register list


The register list: The <register_list> has one bit for each general-purpose
register; bit 0 is for register zero, and bit 15 is for register 15 (the PC).

The register list syntax is an opening brace, followed by a comma-separated list
of registers, followed by a closing brace. A sequence of consecutive registers
may be specified by separating the first and last registers in the range with a minus
sign.

The P, U and W bits: These bits distinguish between the different types of addressing 
mode. See 3.19 Load and Store Multiple Addressing Modes on page 3-116. 

The S bit: For LDMs that load the PC, the S bit indicates that the CPSR is loaded from 
the SPSR. For LDMs that do not load the PC and all STMs, it indicates that when 
the processor is in a privileged mode, the user-mode banked registers are 
transferred and not the registers of the current mode. 


The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) instruction. 


Addressing modes 


For full details of the addressing modes for these instructions, refer to 3.19 Load and 
Store Multiple Addressing Modes on page 3-116, and 3.20 Load and Store Multiple 
Addressing Modes (Alternative names) on page 3-121. 


Examples
STMFD R13!, {R0 - R12, LR}
LDMFD R13!, {R0 - R12, PC}
LDMIA R0, {R5 - R8}
STMDA R1!, {R2, R5, R7 - R9, R11}
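
The first two examples form the usual subroutine entry and exit pair. The following Python sketch models them under stated assumptions: `stmfd` and `ldmfd` are hypothetical helper names, memory is a dictionary keyed by byte address, FD (full descending) is taken to mean "decrement before" for the store and "increment after" for the load, and the lowest-numbered register always occupies the lowest address.

```python
def stmfd(memory, regs, base, reg_list):
    """Model STMFD Rn!, {reg_list}; return the written-back base value."""
    ordered = sorted(reg_list)
    base -= 4 * len(ordered)                  # descending stack: make room first
    for slot, reg in enumerate(ordered):
        memory[base + 4 * slot] = regs[reg]   # lowest register at lowest address
    return base


def ldmfd(memory, regs, base, reg_list):
    """Model LDMFD Rn!, {reg_list}; return the written-back base value."""
    ordered = sorted(reg_list)
    for slot, reg in enumerate(ordered):
        regs[reg] = memory[base + 4 * slot]
    return base + 4 * len(ordered)


LR, PC = 14, 15
regs = [0] * 16
regs[0], regs[1], regs[LR] = 11, 22, 0x8004
memory = {}

sp = stmfd(memory, regs, 0x1000, [0, 1, LR])   # STMFD R13!, {R0, R1, LR}
regs[0] = regs[1] = 0                          # the subroutine body clobbers R0, R1
sp = ldmfd(memory, regs, sp, [0, 1, PC])       # LDMFD R13!, {R0, R1, PC}

print(regs[0], regs[1], hex(regs[PC]), hex(sp))  # 11 22 0x8004 0x1000
```

Because LR is stored in the slot that PC is later loaded from, the load both restores the work registers and returns to the caller.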
3.11.1 List of load and store multiple instructions
LDM Load multiple page 3-41
LDM User registers load multiple page 3-42
LDM Load multiple and restore CPSR page 3-43
STM Store multiple page 3-67
STM User registers store multiple page 3-68
3.12 Semaphore Instructions


The ARM instruction set has two semaphore instructions: 
* Swap (SWP) 
* Swap Byte (SWPB) 


These instructions are provided for process synchronisation. Both instructions 
generate an atomic load and store operation, allowing a memory semaphore to be 
loaded and altered without interruption. 

SWP and SWPB have a single addressing mode; the address is the contents of
a register. Separate registers are used to specify the value to store and the destination
of the load; if the same register is specified, SWP exchanges the value in the register
and the value in memory.

The semaphore instructions do not provide a compare and conditional write facility; 
this must be done explicitly. 


Examples
SWP  R12, R10, [R9] ; load R12 from address R9 and
                    ; store R10 to address R9

SWPB R3, R4, [R8]   ; load byte to R3 from address R8 and
                    ; store byte from R4 to address R8

SWP  R1, R1, [R2]   ; Exchange the value in R1 and the value at the address in R2
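
Since the semaphore instructions provide no compare and conditional write, a lock is typically built by swapping in a "locked" value and testing what came back. The Python sketch below illustrates that idea only; `swp`, `try_acquire`, `release` and the `LOCKED`/`FREE` encodings are hypothetical, and the atomicity that the real instruction guarantees is simply assumed, not modelled.

```python
LOCKED, FREE = 1, 0  # assumed encoding of the semaphore word


def swp(memory, address, new_value):
    """Model SWP Rd, Rm, [Rn]: return the old word and store new_value.

    On the processor the load and store are one atomic operation.
    """
    old_value = memory[address]
    memory[address] = new_value
    return old_value


def try_acquire(memory, address):
    """Swap LOCKED in; the explicit compare decides whether we own the lock."""
    return swp(memory, address, LOCKED) == FREE


def release(memory, address):
    """Give the lock back with an ordinary store."""
    memory[address] = FREE


memory = {0x100: FREE}
print(try_acquire(memory, 0x100))  # True: the lock was free
print(try_acquire(memory, 0x100))  # False: already held
release(memory, 0x100)
print(try_acquire(memory, 0x100))  # True again
```

A failed attempt leaves the word as LOCKED, which is harmless because it already held that value.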


3.12.1 List of semaphore instructions 


SWP Swap page 3-76 
SWPB Swap Byte page 3-77 
3.13 Coprocessor Instructions


Note: Coprocessor instructions are not implemented in Architecture version 1. 


The ARM instruction set provides 3 types of instruction for communicating with 
coprocessors. The instruction set distinguishes up to 16 coprocessors with a 4-bit field 
in each coprocessor instruction, so each coprocessor is assigned a particular number 
(one coprocessor can use more than one of the 16 numbers if a large coprocessor 
instruction set is required). 


The three classes of coprocessor instruction allow: 
* ARM to initiate a coprocessor data processing operation
* ARM registers to be transferred to and from coprocessor registers
* ARM to generate addresses for the coprocessor load and store instructions 


Coprocessors execute the same instruction stream as ARM, ignoring ARM instructions 
and coprocessor instructions for other coprocessors. Coprocessor instructions that 
cannot be executed by coprocessor hardware cause an UNDEFINED instruction trap, 
allowing software emulation of coprocessor hardware. 


A coprocessor can partially execute an instruction and then cause an exception; this is 
useful for handling run-time-generated exceptions (like divide-by-zero or overflow). 


Not all fields in coprocessor instructions are used by ARM; coprocessor register 
specifiers and opcodes are defined by individual coprocessors. Therefore, only generic 
instruction mnemonics are provided for coprocessor instructions; assembler macros 
can be used to transform custom coprocessor mnemonics into these generic 
mnemonics (or to regenerate the opcodes manually). 


Examples 


CDP p5, 2, c12, c10, c3, 4  ; Coprocessor 5 data operation
                            ; opcode 1 = 2, opcode 2 = 4
                            ; destination register is 12
                            ; source registers are 10 and 3

MRC p15, 5, R4, c0, c2, 3   ; Coprocessor 15 transfer to ARM register
                            ; opcode 1 = 5, opcode 2 = 3
                            ; ARM destination register = R4
                            ; coproc source registers are 0 and 2

MCR p14, 1, R7, c7, c12, 6  ; ARM register transfer to Coprocessor 14
                            ; opcode 1 = 1, opcode 2 = 6
                            ; ARM source register = R7
                            ; coproc dest registers are 7 and 12

LDC p6, CR1, [R4]           ; Load from memory to coprocessor 6
                            ; ARM register 4 contains the address
                            ; Load to CP reg 1

LDC p6, CR4, [R2, #4]       ; Load from memory to coprocessor 6
                            ; ARM register R2 + 4 is the address
                            ; Load to CP reg 4

STC p8, CR8, [R2, #4]!      ; Store from coprocessor 8 to memory
                            ; ARM register R2 + 4 is the address
                            ; after the transfer R2 = R2 + 4
                            ; Store from CP reg 8

STC p8, CR9, [R2], #-16     ; Store from coprocessor 8 to memory
                            ; ARM register R2 holds the address
                            ; after the transfer R2 = R2 - 16
                            ; Store from CP reg 9


3.13.1 List of coprocessor instructions
CDP Coprocessor data operations page 3-36 
LDC Load coprocessor register page 3-40 
MCR Move to coprocessor from ARM register page 3-51 
MRC Move to ARM register from coprocessor page 3-54 
STC Store coprocessor register page 3-66 
3.14 Extending the Instruction Set


The ARM instruction set can be extended in four areas:
* Arithmetic instruction extension space
* Control instruction extension space
* Load/store instruction extension space
* Coprocessor instruction extension space


Currently, these instructions are UNDEFINED (they cause an undefined instruction 
exception). These parts of the address space will be used in the future to add new 
instructions. 


3.14.1 Arithmetic instruction extension space


Instructions with the following opcodes are the arithmetic instruction extension space:
opcode[27:24] = 0
opcode[7:4] = 0b1001


The field names given are only guidelines, which are likely to simplify implementation.


31    28 27    24 23   20 19   16 15   12 11    8 7     4 3     0
  cond    0 0 0 0    op1     Rn      Rd      Rs    1 0 0 1    Rm


MUL and MLA

Multiply and Multiply Accumulate (MUL and MLA) instructions use op1 values 0 to 3.

Rn specifies the destination register
Rm and Rs specify source registers
Rd specifies the accumulated value


UMULL, UMLAL, SMULL, SMLAL

The Signed and Unsigned Multiply Long and Multiply Accumulate Long (UMULL,
UMLAL, SMULL, SMLAL) instructions use op1 values 8 to 15:

Rn and Rd specify the destination registers
Rm and Rs specify the source registers


Other opcodes

The meaning of all other opcodes in the arithmetic instruction extension space is:
* UNDEFINED on ARM Architecture 4
* UNPREDICTABLE on earlier versions of the architecture.
3.14.2 Control instruction extension space


Instructions with: 


opcode[27:26] = 0b00 

opcode[24:23] = 0b10 

opcode [20] = 0 
and not: 

opcode [25] = 0 

opcode [7] = 1 

opcode [4] = 1 





are the control instruction extension space. The field names given are only guidelines, 
which are likely to simplify implementation. 
31   28 27 26 25 24 23 22 21 20 19  16 15  12 11      8  7  6  5  4  3     0
  cond   0  0  0  1  0   op1  0    Rn     Rd      Rs       op2     0    Rm
  cond   0  0  0  1  0   op1  0    Rn     Rd      Rs     0   op2   1    Rm
  cond   0  0  1  1  0   op1  0    Rn     Rd   rotate_imm   8_bit_immediate


MRS

The Move status register to general-purpose register (MRS) instruction sets:
opcode[25] = 0
opcode[19:16] = 0b1111
opcode[11:0] = 0

and uses both op1 = 0b00 and op1 = 0b10.

Rd is used to specify the destination register.


MSR

The Move general-purpose register to status register (MSR) instruction sets:
opcode[25] = 0
opcode[15:12] = 0b1111
opcode[11:4] = 0

and uses both op1 = 0b01 and op1 = 0b11.

Rm specifies the source register
opcode[19:16] hold the write mask

The Move immediate value to status register (MSR) instruction sets:
opcode[25] = 1
opcode[15:12] = 0b1111

and uses both op1 = 0b01 and op1 = 0b11.

Rm specifies the source register
opcode[11:0] specify the immediate operand
opcode[19:16] hold the write mask


BX

The Branch and Exchange Instruction Set (BX) instruction sets:
opcode[25] = 0
opcode[19:8] = 0b111111111111
opcode[7:4] = 0b0001

and uses op1 = 0b01.

Rm is used to specify the source register.


Other opcodes

The meaning of all other opcodes in the control instruction extension space is:
* UNDEFINED on ARM Architecture 4
* UNPREDICTABLE on earlier versions of the architecture


3.14.3 Load/Store instruction extension space 


Instructions with 


opcode[27:25] = 0b000 

opcode [7] = 1 

opcode [4] = 1 
and not 

opcode [24] = 0 

opcode [6:5] = 0 





are the load/store instruction extension space. 
The field names given are only guidelines, which are likely to simplify implementation. 


31   28 27 26 25 24 23 22 21 20 19  16 15  12 11   8  7  6  5  4  3    0
  cond   0  0  0  P  U  B  W  L    Rn     Rd     Rs    1  op1   1   Rm


SWP and SWPB

The Swap and Swap Byte (SWP and SWPB) instructions set:
opcode[24:23] = 0b10
opcode[21:20] = 0b00
opcode[11:4] = 0b00001001
where:
Rn specifies the base address
Rd specifies the destination register
Rm specifies the source register
opcode[22] indicates a byte transfer


LDRH

The Load Halfword (LDRH) instruction sets:
opcode[20] = 1
op1 = 0b01
where:
the B bit distinguishes between a register and an immediate offset
the P, U and W bits specify the addressing mode
Rn specifies the base register
Rd specifies the destination register
Rm specifies a register offset
Rs and Rm specify an immediate offset


LDRSH

The Load Signed Halfword (LDRSH) instruction sets:
opcode[20] = 1
op1 = 0b11
where:
the B bit distinguishes between a register and an immediate offset
the P, U and W bits specify the addressing mode
Rn specifies the base register
Rd specifies the destination register
Rm specifies a register offset
Rs and Rm specify an immediate offset


LDRSB

The Load Signed Byte (LDRSB) instruction sets:
opcode[20] = 1
op1 = 0b10
where:
the B bit distinguishes between a register and an immediate offset
the P, U and W bits specify the addressing mode
Rn specifies the base register
Rd specifies the destination register
Rm specifies a register offset
Rs and Rm specify an immediate offset


STRH

The Store Halfword (STRH) instruction sets:
opcode[20] = 0
op1 = 0b01
where:
the B bit distinguishes between a register and an immediate offset
the P, U and W bits specify the addressing mode
Rn specifies the base register
Rd specifies the source register
Rm specifies a register offset
Rs and Rm specify an immediate offset


Other opcodes 


The meaning of all other opcodes in the Load/Store instruction extension space is: 
* UNDEFINED on ARM Architecture 4 


* UNPREDICTABLE on earlier versions of the architecture 
3.14.4 Coprocessor instruction extension space


Instructions with
opcode[27:24] = 0b1100
opcode[21] = 0

are the coprocessor instruction extension space. The names given to fields are only
guidelines, which if followed are likely to simplify implementation.


31   28 27 26 25 24 23 22 21 20 19  16 15  12 11    8  7          0
  cond   1  1  0  0  x  x  0  x    Rn     CRd   cp_num    offset


The meaning of instructions in the coprocessor instruction extension space is:
* UNDEFINED on ARM Architecture 4
* UNPREDICTABLE on earlier versions of the architecture


3.14.5 Undefined instruction space


Instructions with
opcode[27:25] = 0b011
opcode[4] = 1

are UNDEFINED instruction space.


31   28 27 26 25 24                     5  4  3        0
  cond   0  1  1   x x x x ... x x x x     1   x x x x


The meaning of instructions in the UNDEFINED instruction space is UNDEFINED on all
versions of the ARM Architecture.
Instruction name
given in the following alphabetical list

Functional area
described in the preceding section of this chapter

Addressing mode
indicates if an addressing mode applies to this instruction

Architecture availability
indicates if there is a restriction on availability. Not all instructions are
available in all versions of the ARM architecture

Encoding
specifies the bit patterns for the instruction

Operation
describes the operation of the instruction in pseudo-code

Exceptions
lists any possible exceptions

Qualifiers and flag settings
lists any conditions and flag settings that apply to the instruction

User notes
gives notes on using the instruction


[Figure: annotated sample instruction page (STRH), showing where each of the items listed above appears on the page.]

ADC{<cond>}{S} Rd, Rn, <shifter_operand>


Description The ADC (Add with Carry) instruction adds the value of <shifter_operand>
and the value of the Carry flag to the value of register Rn, and stores the result in
the destination register Rd. The condition code flags are optionally updated (based
on the result).

ADC is used to synthesize multi-word addition. If register pairs R0,R1 and R2,R3
hold 64-bit values (where R0 and R2 hold the least-significant words), the following
instructions leave the 64-bit sum in R4,R5:

ADDS R4,R0,R2
ADC R5,R1,R3

The instruction:

ADCS R0,R0,R0

produces a single-bit Rotate Left with Extend operation (33-bit rotate through
the carry flag) on R0. See page 3-97 for more information.

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31    28 27 26 25 24    21 20 19   16 15   12 11                 0
  cond    0  0  I  0 1 0 1  S    Rn      Rd     shifter_operand
Operation
if ConditionPassed(<cond>) then
    Rd = Rn + <shifter_operand> + C Flag
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = CarryFrom(Rn + <shifter_operand> + C Flag)
        V Flag = OverflowFrom(Rn + <shifter_operand> + C Flag)

Exceptions None

Qualifiers Condition Code
S updates condition code flags N,Z,C,V

Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.

The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.

Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the result
of the operation is placed in the PC. When Rd is R15 and the S flag is set,
the result of the operation is placed in the PC and the SPSR corresponding to
the current mode is moved to the CPSR. This allows state changes which
atomically restore both PC and CPSR. This form of the instruction is
UNPREDICTABLE in User mode and System mode.
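
The two uses of ADC described above (multi-word addition and the 33-bit rotate) can be checked with a short Python model. This is a sketch under simple assumptions: registers are unsigned 32-bit integers, and the helper names `adds`, `adc`, `add64` and `rotate_left_extend` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """ADDS: return (32-bit result, carry out)."""
    total = a + b
    return total & MASK32, total >> 32


def adc(a, b, carry):
    """ADC without S: add two words plus the incoming carry."""
    return (a + b + carry) & MASK32


def add64(lo_a, hi_a, lo_b, hi_b):
    """ADDS R4,R0,R2 followed by ADC R5,R1,R3."""
    lo, carry = adds(lo_a, lo_b)
    hi = adc(hi_a, hi_b, carry)
    return lo, hi


def rotate_left_extend(value, carry):
    """ADCS R0,R0,R0: shift left one bit through the carry flag."""
    total = value + value + carry
    return total & MASK32, total >> 32


print([hex(w) for w in add64(0xFFFFFFFF, 0x0, 0x1, 0x0)])           # ['0x0', '0x1']
print([hex(w) for w in rotate_left_extend(0x80000001, 1)])          # ['0x3', '0x1']
```

In the rotate, the old carry enters at bit 0 and the old bit 31 becomes the new carry, which is what makes the operation a 33-bit rotate.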
ADD{<cond>}{S} Rd, Rn, <shifter_operand>


Description The ADD instruction adds the value of <shifter_operand> to the value of
register Rn, and stores the result in the destination register Rd. The condition code
flags are optionally updated (based on the result).

ADD is used to add two values together to produce a third.

To increment a register value (in Rx), use:
ADD Rx, Rx, #1

Constant multiplication (of Rx) by 2^n+1 (into Rd) can be performed with:
ADD Rd, Rx, Rx, LSL #n

To form a PC-relative address, use:
ADD Rs, PC, #offset

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31    28 27 26 25 24    21 20 19   16 15   12 11                 0
  cond    0  0  I  0 1 0 0  S    Rn      Rd     shifter_operand
Operation if ConditionPassed(<cond>) then
    Rd = Rn + <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = CarryFrom(Rn + <shifter_operand>)
        V Flag = OverflowFrom(Rn + <shifter_operand>)

Exceptions None

Qualifiers Condition Code
S updates condition code flags N,Z,C,V

Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.

The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.

Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the result
of the operation is placed in the PC. When Rd is R15 and the S flag is set,
the result of the operation is placed in the PC and the SPSR corresponding to
the current mode is moved to the CPSR. This allows state changes which
atomically restore both PC and CPSR. This form of the instruction is
UNPREDICTABLE in User mode and System mode.
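
The flag updates in the Operation pseudo-code for ADD can be made concrete with a Python sketch. It assumes the usual meanings of CarryFrom (unsigned overflow out of bit 31) and OverflowFrom (signed overflow); the names `add_with_flags` and `times_2n_plus_1` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(rn, operand):
    """Model ADDS: return (result, (N, Z, C, V)) for 32-bit unsigned inputs."""
    unsigned_sum = rn + operand
    result = unsigned_sum & MASK32
    n = result >> 31
    z = 1 if result == 0 else 0
    c = 1 if unsigned_sum > MASK32 else 0          # CarryFrom
    # OverflowFrom: both operands differ in sign from the result.
    v = (((rn ^ result) & (operand ^ result)) >> 31) & 1
    return result, (n, z, c, v)


def times_2n_plus_1(rx, n):
    """Model ADD Rd, Rx, Rx, LSL #n: multiply by 2^n + 1."""
    return (rx + (rx << n)) & MASK32


print(add_with_flags(0x7FFFFFFF, 1))   # (2147483648, (1, 0, 0, 1))
print(add_with_flags(0xFFFFFFFF, 1))   # (0, (0, 1, 1, 0))
print(times_2n_plus_1(5, 3))           # 45
```

The first call overflows as a signed addition (V set) but not as an unsigned one; the second does the opposite (C set).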


AND{<cond>}{S} Rd, Rn, <shifter_operand> 


Description The AND instruction performs a bitwise AND of the value of register Rn with 
the value of <shifter_operand>, and stores the result in the destination 
register Rd. The condition code flags are optionally updated (based on the result). 


AND is most useful for extracting a field from a register, by ANDing the register 
with a mask value that has 1's in the field to be extracted, and 0's elsewhere. 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31    28 27 26 25 24    21 20 19   16 15   12 11                 0
  cond    0  0  I  0 0 0 0  S    Rn      Rd     shifter_operand
Operation if ConditionPassed(<cond>) then
    Rd = Rn AND <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected

Exceptions None

Qualifiers Condition Code
S updates condition code flags N,Z,C

Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.

The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.

Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the
result of the operation is placed in the PC. When Rd is R15 and the S flag is
set, the result of the operation is placed in the PC and the SPSR
corresponding to the current mode is moved to the CPSR. This allows state
changes which atomically restore both PC and CPSR. This form of the
instruction is UNPREDICTABLE in User mode and System mode.
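
Field extraction with AND, as described above, amounts to building a mask with 1s over the field and 0s elsewhere. A minimal Python sketch follows; the helper names `field_mask` and `extract_field` are hypothetical, and note that AND leaves the field in its original bit position (a separate shift would be needed to move it to bit 0).

```python
MASK32 = 0xFFFFFFFF


def field_mask(lsb, width):
    """Mask with `width` ones starting at bit `lsb`, zeros elsewhere."""
    return (((1 << width) - 1) << lsb) & MASK32


def extract_field(value, lsb, width):
    """Model AND Rd, Rn, #mask: keep only the selected field, in place."""
    return value & field_mask(lsb, width)


print(hex(field_mask(8, 8)))                   # 0xff00
print(hex(extract_field(0x12345678, 8, 8)))    # 0x5600
```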
B{L}{<cond>} <target address>


Description The B (Branch) and BL (Branch and Link) instructions provide both conditional and 
unconditional changes to program flow. The Branch with Link instruction is used to 
perform a subroutine call; the return from subroutine is achieved by copying the LR 
to the PC. 


B and BL cause a branch to a target address. The branch target address is 
calculated by: 


1 shifting the 24-bit signed (two's complement) offset left two bits
2 sign-extending the result to 32 bits
3 adding this to the contents of the PC (which contains the address of the
branch instruction plus 8)


The instruction can therefore specify a branch of +/- 32Mbytes. 


In the BL variant of the instruction, the L (link) bit is set, and the address of the 
instruction following the branch is copied into the link register (R14). 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31    28 27 26 25 24 23                              0
  cond    1  0  1  L      24_bit_signed_offset

Operation
if ConditionPassed(<cond>) then
    if L == 1 then
        LR = address of the instruction after the branch instruction
    PC = PC + (SignExtend(<24_bit_signed_offset>) << 2)





Exceptions None 


Qualifiers Condition Code 
L (Link) stores a return address in the LR (R14) register 
Notes Offset calculation: An assembler will calculate the branch offset address from
the difference between the address of the current instruction and the address
of the target (given as a program label) minus eight (because the PC holds
the address of the current instruction plus eight).


Memory bounds: Branching backwards past location zero and forwards over 
the end of the 32-bit address space is UNPREDICTABLE. 
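
The offset calculation an assembler performs, and the target calculation the processor performs, can be sketched in Python as follows. The function names `encode_branch_offset` and `branch_target` are hypothetical, and the range check simply reflects the 24-bit signed field.

```python
def encode_branch_offset(instr_addr, target_addr):
    """Return the 24-bit offset field for a B/BL at instr_addr to target_addr."""
    byte_offset = target_addr - (instr_addr + 8)   # PC reads as instruction + 8
    if byte_offset % 4:
        raise ValueError("branch target must be word aligned")
    word_offset = byte_offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is outside the +/- 32Mbyte branch range")
    return word_offset & 0xFFFFFF                  # two's complement, 24 bits


def branch_target(instr_addr, offset_field):
    """Recover the target address from the 24-bit offset field."""
    if offset_field & 0x800000:                    # sign-extend the field
        offset_field -= 1 << 24
    return (instr_addr + 8 + (offset_field << 2)) & 0xFFFFFFFF


print(hex(encode_branch_offset(0x8000, 0x8000)))   # 0xfffffe (branch to self)
print(hex(branch_target(0x8000, 0xFFFFFE)))        # 0x8000
```

A branch to its own address therefore encodes an offset of -2 words, which compensates for the PC being eight bytes ahead.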
BIC{<cond>}{S} Rd, Rn, <shifter_operand>


Description The BIC (Bit Clear) instruction performs a bitwise AND of the value of register Rn 
with the complement of the value of <shifter_operand>, and stores the result 
in the destination register Rd. The condition code flags are optionally updated 
(based on the result). 


BIC can be used to clear selected bits in a register; for each bit, BIC with 1 will clear 
the bit, BIC with 0 will leave it unchanged. 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31    28 27 26 25 24    21 20 19   16 15   12 11                 0
  cond    0  0  I  1 1 1 0  S    Rn      Rd     shifter_operand

Operation if ConditionPassed(<cond>) then
    Rd = Rn AND NOT <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected

Exceptions None

Qualifiers Condition Code
S updates condition code flags N,Z,C


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the 
result of the operation is placed in the PC. When Rd is R15 and the S flag is 
set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of the 
instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.
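
The bit-clear behaviour of BIC (a 1 in the operand clears the bit, a 0 leaves it unchanged) is illustrated by the Python sketch below; `bic` is a hypothetical helper name operating on unsigned 32-bit values.

```python
MASK32 = 0xFFFFFFFF


def bic(rn, operand):
    """Model BIC Rd, Rn, <shifter_operand>: Rd = Rn AND NOT operand."""
    return rn & ~operand & MASK32


print(hex(bic(0x000000FF, 0x0000000F)))   # 0xf0
print(hex(bic(0xFFFFFFFF, 0x80000000)))   # 0x7fffffff
```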
BX{<cond>} Rm


Description The BX (Branch and Exchange instruction set) instruction is UNDEFINED on ARM Architecture
Version 4. On ARM Architecture Version 4T, this instruction branches and selects 
the instruction set decoder to use to decode the instructions at the branch 
destination. The branch target address is the value of register Rm. The T flag is 
updated with bit 0 of the value of register Rm. 


BX is used to branch between ARM code and THUMB code. On ARM Architecture 
4, it causes an UNDEFINED instruction exception to allow the THUMB instruction set 
to be emulated. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
Operation if ConditionPassed(<cond>) then
    T Flag = Rm[0]
    PC = Rm[31:1] << 1

Exceptions None

Qualifiers Condition Code

Notes Transferring to THUMB: When transferring to the THUMB instruction set, bit[0] of
PC will be cleared (set to zero), and bits[31:1] will be copied from Rm to
the PC.

Transferring to ARM: When transferring to the ARM instruction set, bit[0] of PC will
be cleared (set to zero), and bits[31:1] will be copied from Rm to the PC.
If bit[1] of Rm is set, the result is UNPREDICTABLE.
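
The Operation for BX can be sketched in Python as below. The function name `bx` is hypothetical, and the UNPREDICTABLE case (transferring to ARM with bit[1] of Rm set) is modelled by raising an exception, which is only one possible way to represent it.

```python
def bx(rm):
    """Model BX Rm: return (new PC, new T flag) for a 32-bit value in Rm."""
    t_flag = rm & 1                    # T Flag = Rm[0]
    pc = rm & 0xFFFFFFFE               # PC = Rm[31:1] << 1
    if t_flag == 0 and rm & 0b10:
        # ARM instructions are word aligned; the manual calls this UNPREDICTABLE.
        raise ValueError("UNPREDICTABLE: ARM target with bit[1] set")
    return pc, t_flag


print(bx(0x8001))   # (32768, 1): continue at 0x8000 in THUMB state
print(bx(0x8000))   # (32768, 0): continue at 0x8000 in ARM state
```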
CDP{<cond>} p<cp#>, <opcode_1>, CRd, CRn, CRm, <opcode_2>


Description The CDP (Coprocessor Data Processing) instruction tells the coprocessor 
specified by <cp#> to perform an operation that is independent of ARM registers 
and memory. If no coprocessors indicate that they can execute the instruction, 
an UNDEFINED instruction exception is generated. 





CDP is used to initiate coprocessor instructions that do not operate on values in
ARM registers or in main memory; for example, a floating-point multiply instruction
for a floating-point accelerator coprocessor.

Not in architecture v1


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

31    28 27 26 25 24 23    20 19   16 15   12 11     8 7      5  4  3     0
  cond    1  1  1  0  opcode_1   CRn     CRd     cp_num  opcode_2  0    CRm

Operation if ConditionPassed(<cond>) then
    Coprocessor[<cp_num>] dependent operation

Exceptions Undefined Instruction

Qualifiers Condition Code

Notes Coprocessor fields: Only instruction bits[31:24], bits[11:8] and bit[4] are
architecture defined; the remaining fields are only recommendations, which
if followed, will be compatible with ARM Development Systems.
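
As an illustration of how the CDP fields pack into a 32-bit instruction word, the Python sketch below assembles one by hand. It assumes the recommended field layout (opcode_1 in bits[23:20], CRn in bits[19:16], CRd in bits[15:12], opcode_2 in bits[7:5], CRm in bits[3:0]) and a default condition of 0b1110 (always); `encode_cdp` is a hypothetical helper, not part of any ARM tool.

```python
def encode_cdp(cp_num, opcode_1, crd, crn, crm, opcode_2, cond=0xE):
    """Assemble CDP p<cp_num>, <opcode_1>, CRd, CRn, CRm, <opcode_2>."""
    fields = (
        ("cond", cond, 4), ("cp_num", cp_num, 4), ("opcode_1", opcode_1, 4),
        ("CRd", crd, 4), ("CRn", crn, 4), ("CRm", crm, 4), ("opcode_2", opcode_2, 3),
    )
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return (
        (cond << 28)
        | (0b1110 << 24)      # architecture-defined bits[27:24]
        | (opcode_1 << 20)
        | (crn << 16)
        | (crd << 12)
        | (cp_num << 8)       # architecture-defined coprocessor number
        | (opcode_2 << 5)     # bit[4] stays 0 for CDP
        | crm
    )


# CDP p5, 2, c12, c10, c3, 4
print(hex(encode_cdp(5, 2, 12, 10, 3, 4)))   # 0xee2ac583
```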
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CMN{<cond>} Rn, <shifter_operand> 


Description The CMN (Compare Negative) instruction compares an arithmetic value and
the negative of an arithmetic value (an immediate or the value of a register) and
sets the condition code flags so that subsequent instructions can be conditionally
executed.

Addressing mode 1


CMN performs a comparison by adding (or subtracting the negative of) the value
of <shifter_operand> to (from) the value of register Rn, and updates
the condition code flags (based on the result). The comparison is the subtraction
of the negative of the second operand from the first operand (which is the same as
adding the two operands).





The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 


    <alu_out> = Rn + <shifter_operand>
    N Flag = <alu_out>[31]
    Z Flag = if <alu_out> == 0 then 1 else 0
    C Flag = CarryFrom(Rn + <shifter_operand>)
    V Flag = OverflowFrom(Rn + <shifter_operand>)
Exceptions None 
Qualifiers Condition Code 
Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 
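
The flag computation for CMN can be illustrated with a small Python model. It is a sketch only: `cmn_flags` is a hypothetical helper, and `CarryFrom` and `OverflowFrom` are modelled here as unsigned carry out of bit 31 and signed overflow of a 32-bit addition respectively.

```python
MASK32 = 0xFFFFFFFF


def cmn_flags(rn, operand):
    """Return (N, Z, C, V) as set by CMN Rn, <shifter_operand>."""
    rn &= MASK32
    operand &= MASK32
    total = rn + operand              # unbounded sum
    alu_out = total & MASK32          # 32-bit result (discarded by CMN)
    n = alu_out >> 31
    z = 1 if alu_out == 0 else 0
    c = 1 if total > MASK32 else 0    # CarryFrom: carry out of bit 31
    # OverflowFrom: both operands have a sign that differs from the result's
    v = ((rn ^ alu_out) & (operand ^ alu_out)) >> 31
    return n, z, c, v


# Comparing 1 with the negative of -1: the sum is zero, so Z is set.
print(cmn_flags(1, 0xFFFFFFFF))
# 0x7FFFFFFF + 1 overflows the signed range, so N and V are set.
print(cmn_flags(0x7FFFFFFF, 1))
```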
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Description 


CMP{<cond>} Rn, <shifter_operand> 


The CMP (Compare) instruction compares two arithmetic values and sets
the condition code flags so that subsequent instructions can be conditionally
executed. The comparison is a subtraction of the second operand from the first
operand.


CMP performs a comparison by subtracting the value of <shifter_operand> 
from the value of register Rn and updates the condition code flags (based on 
the result). 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 


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Operation 


if ConditionPassed(<cond>) then
    <alu_out> = Rn - <shifter_operand>
    N Flag = <alu_out>[31]
    Z Flag = if <alu_out> == 0 then 1 else 0
    C Flag = NOT BorrowFrom(Rn - <shifter_operand>)
    V Flag = OverflowFrom(Rn - <shifter_operand>)


Exceptions None

Qualifiers Condition Code


Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.
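
As an illustration of the CMP flag rules, the following Python sketch models the subtraction on 32-bit values. The helper name `cmp_flags` is an assumption for the example; `NOT BorrowFrom` is modelled as "Rn is greater than or equal to the operand when both are treated as unsigned".

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, operand):
    """Return (N, Z, C, V) as set by CMP Rn, <shifter_operand>."""
    rn &= MASK32
    operand &= MASK32
    alu_out = (rn - operand) & MASK32     # 32-bit result (discarded by CMP)
    n = alu_out >> 31
    z = 1 if alu_out == 0 else 0
    c = 1 if rn >= operand else 0         # NOT BorrowFrom: no unsigned borrow
    # OverflowFrom: operand signs differ and the result sign differs from Rn's
    v = ((rn ^ operand) & (rn ^ alu_out)) >> 31
    return n, z, c, v


print(cmp_flags(5, 5))            # equal values: Z and C set
print(cmp_flags(3, 5))            # unsigned lower: N set, C clear
print(cmp_flags(0x80000000, 1))   # signed overflow: C and V set
```

Note that C is set when no borrow occurs, which is the opposite sense to the borrow flag on some other architectures.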
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EOR{<cond>}{S} Rd, Rn, <shifter_operand> 


Description The EOR (Exclusive-OR) instruction performs a bitwise Exclusive-OR of the value
of register Rn with the value of <shifter_operand>, and stores the result in the
destination register Rd. The condition code flags are optionally updated (based on
the result).

Addressing mode 1

EOR can be used to invert selected bits in a register; for each bit, EOR with 1 will
invert that bit, and EOR with 0 will leave it unchanged.


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 

    Rd = Rn EOR <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N, Z and C


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of the 
instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 
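
The use of EOR to invert selected bits can be shown with a short Python sketch. The function name `eor` and the sample values are assumptions chosen for the illustration; flag updates and the shifter are not modelled.

```python
def eor(rn, operand):
    """Model Rd = Rn EOR <shifter_operand> on 32-bit values."""
    return (rn ^ operand) & 0xFFFFFFFF


value = 0b10101100
mask = 0b00001111          # 1 bits mark the positions to invert

toggled = eor(value, mask)
print(format(toggled, "08b"))              # low four bits inverted
print(format(eor(toggled, mask), "08b"))   # applying the mask again restores value
```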
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3-40 


LDC{<cond>} p<cp_num>, CRd, <addressing_mode> 


Description The LDC (Load Coprocessor) instruction is useful to load coprocessor data from 
memory. The N bit could be used to distinguish between a single- and 
double-precision transfer for a floating-point load instruction. 


LDC loads memory data from the sequence of consecutive memory addresses 
calculated by <addressing_mode> to the coprocessor specified by <cp_num>. 
If no coprocessors indicate that they can execute the instruction, an UNDEFINED 
instruction exception is generated. 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 

    <address> = <start_address>
    while (NotFinished(Coprocessor[<cp_num>]))
        Coprocessor[<cp_num>] = Memory[<address>,4]
        <address> = <address> + 4
    assert <address> == <end_address>


Exceptions Undefined Instruction; Data Abort 
Qualifiers Condition Code 


Notes Addressing mode: The P, U and W bits specify the <addressing_mode>. 
See Addressing Mode 5 starting on page 3-123. 


The N bit: This bit is coprocessor-dependent. It can be used to distinguish 
between two sizes of data to transfer. 


Register Rn: Specifies the base register used by <addressing_mode>. 


Coprocessor fields: Only instruction bits[31:23], bits[21:16] and bits[11:0] are
ARM architecture-defined; the remaining fields (bit[22] and bits[15:12]) are
recommendations for compatibility with ARM Development Systems.


Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value. 


Non-word-aligned addresses: Load coprocessor register instructions ignore 
the least-significant two bits of <address> (the words are not rotated as for 
load word). 

Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 


ARM Architecture Reference Manual 
ARM DUI 0100B 




LDM{<cond>}<addressing_mode> Rn{!}, <registers> 


Description This form of the LDM (Load Multiple) instruction is useful as a block load instruction
(combined with store multiple it allows efficient block copy) and for stack
operations, including procedure exit, to restore saved registers, load the PC with
the return address, and update the stack pointer.

Addressing mode 4


In this case, LDM loads a non-empty subset (or possibly all) of the general- 
purpose registers from sequential memory locations. The registers are loaded in 
sequence, the lowest-numbered register first, from the lowest memory address 
(<start_addr>); the highest-numbered register last, from the highest memory 
address (<end_addr>). If the PC is specified in the register list (opcode bit 15 is 
set), the instruction causes a branch to the address (data) loaded into the PC. 





The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 
    <address> = <start_addr>
    for i = 0 to 15
        if <register_list>[i] == 1
            Ri = Memory[<address>,4]
            <address> = <address> + 4
    assert <end_addr> == <address> - 4





Exceptions Data Abort 


Qualifiers Condition Code 
! sets the W bit, causing base register update 


Notes Addressing mode: The P, U and W bits distinguish between the different types of 
addressing mode. See Addressing Mode 4 starting on page 3-116. 


Register Rn: Specifies the base register used by <addressing_mode>. 
Use of R15: Using R15 as the base register Rn gives an UNPREDICTABLE result. 


Operand restrictions: If the base register Rn is specified in <register_list>, 
and writeback is specified, the final value of Rn is UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> specifies 
writeback, the value left in Rn is IMPLEMENTATION DEFINED, but is either 
the original base register value or the updated base register value (even if Rn 
is specified in <register_list>). If register 15 is specified in 
<register_list>, it must not be overwritten if a data abort occurs. 


Non-word-aligned addresses: Load multiple instructions ignore the least-significant 
two bits of <address> (the words are not rotated as for load word). 
Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
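
The register transfer order of this form of LDM can be sketched as follows. This is a simplified model under stated assumptions: memory is represented as a Python dict from word-aligned addresses to 32-bit words, `ldm` is a hypothetical helper, and writeback, data aborts and the addressing-mode calculation of the start address are not modelled.

```python
def ldm(memory, regs, start_addr, register_list):
    """Load the registers selected by the 16-bit register_list mask.

    memory: dict mapping word-aligned addresses to 32-bit words (assumed model)
    regs:   list of 16 register values, updated in place
    Returns the address of the last word loaded (the end address).
    """
    address = start_addr & ~0b11          # least-significant two bits are ignored
    for i in range(16):                   # lowest-numbered register first
        if (register_list >> i) & 1:
            regs[i] = memory[address]
            address += 4
    return address - 4


memory = {0x1000: 11, 0x1004: 22, 0x1008: 33}
regs = [0] * 16
# Register list {R0, R1, R15}: R0 comes from the lowest address, R15 from the highest.
end_addr = ldm(memory, regs, 0x1000, 0b1000000000000011)
print(regs[0], regs[1], regs[15], hex(end_addr))
```

Because R15 is in the list, the value loaded into it (33 in this toy example) would become the branch target.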
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LDM{<cond>}<addressing_mode> Rn, <registers>^


Description This form of the LDM (Load Multiple) instruction loads user mode registers when 
the processor is in a privileged mode (useful when performing process swaps). 





In this case, LDM loads a non-empty subset (or possibly all except
the PC) of the user mode general-purpose registers (which are also the system
mode general-purpose registers) from sequential memory locations. The registers
are loaded in sequence, the lowest-numbered register first, from the lowest
memory address (<start_address>); the highest-numbered register last, from
the highest memory address (<end_address>).


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation if ConditionPassed(<cond>) then 
    <address> = <start_address>
    for i = 0 to 14
        if <register_list>[i] == 1
            Ri_usr = Memory[<address>,4]
            <address> = <address> + 4
    assert <end_address> == <address> - 4


Addressing mode 4





Exceptions Data Abort 

Qualifiers Condition Code 

Notes Addressing mode: The P and U bits distinguish between the different types of 
addressing mode. See Addressing Mode 4 starting on page 3-116. 


Banked registers: LDM must not be followed by an instruction which accesses 
banked registers (a following NOP is a good way to ensure this). 


Writeback: Setting bit 21 (the W bit) has UNPREDICTABLE results. 
User and System mode: LDM is UNPREDICTABLE in user mode or system mode. 
Register Rn: Specifies the base register used by <addressing_mode>. 


Use of R15: If register 15 is specified as the base register Rn, the result is
UNPREDICTABLE.


Base register mode: The base register is read from the current processor mode 
registers, not the user mode registers. 


Data Abort: If a data abort is signalled, the value left in Rn is the original base 
register value. 

Non-word-aligned addresses: LDM instructions ignore the least-significant two bits 
of <address> (words are not rotated as for load word). 

Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
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LDM{<cond>}<addressing_mode> Rn{!}, <registers_and_pc>^


Description This form of the LDM (Load Multiple) instruction is useful for returning from 
an exception, to restore saved registers, load the PC with the return address, 
update the stack pointer, and restore the CPSR from the SPSR. 


In this case, LDM loads a non-empty subset (or possibly all) of the general- 
purpose registers and the PC from sequential memory locations. The registers are 
loaded in sequence, the lowest-numbered register first, from the lowest memory 
address; the highest-numbered register last, from the highest memory address. 
The SPSR of the current mode is copied to the CPSR. 
The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation if ConditionPassed(<cond>) then 
    <address> = <start_address>
    for i = 0 to 15
        if <register_list>[i] == 1
            Ri = Memory[<address>,4]
            <address> = <address> + 4
    assert <end_address> == <address> - 4
    CPSR = SPSR


Addressing mode 4








Exceptions Data Abort 


Qualifiers Condition Code 
! sets the W bit, causing base register update 


Notes Addressing mode: The P, U and W bits distinguish between the different types of 
addressing mode. See Addressing Mode 4 starting on page 3-116. 
Register Rn: Specifies the base register used by <addressing_mode>. 
Use of R15: Using R15 as the base register Rn gives an UNPREDICTABLE result. 
User and System mode: This instruction is UNPREDICTABLE in user or system mode. 


Operand restrictions: If the base register Rn is specified in <register_list>, 
and writeback is specified, the final value of Rn is UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> specifies 
writeback, the value left in Rn is IMPLEMENTATION DEFINED, but is either 
the original base register value or the updated base register value (even if Rn 
is specified in <register_list>). If register 15 is specified in 
<register_list>, it must not be overwritten if a data abort occurs. 
Non-word-aligned addresses: Load multiple instructions ignore the least-significant 
two bits of <address> (the words are not rotated as for load word). 


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
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Addressing mode 2 




LDR{<cond>} Rd, <addressing_mode>


Description Combined with a suitable addressing mode, the LDR (Load Register) instruction
allows 32-bit memory data to be loaded into a general-purpose register where its
value can be manipulated. If the destination register is the PC, this instruction
loads a 32-bit address from memory and branches to that address (precede
the LDR instruction with MOV LR, PC to synthesize a branch and link).


Using the PC as the base register allows PC-relative addressing, to facilitate
position-independent code.


LDR loads a word from the memory address calculated by <addressing_mode>
and writes it to register Rd. If the address is not word-aligned, the loaded data is
rotated so that the addressed byte occupies the least-significant byte of
the register. If the PC is specified as register Rd, the instruction branches to
the address (data) loaded into the PC.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation 


if ConditionPassed(<cond>) then
    if <address>[1:0] == 0b00
        Rd = Memory[<address>,4]
    else if <address>[1:0] == 0b01
        Rd = Memory[<address>,4] Rotate_Right 8
    else if <address>[1:0] == 0b10
        Rd = Memory[<address>,4] Rotate_Right 16
    else /* <address>[1:0] == 0b11 */
        Rd = Memory[<address>,4] Rotate_Right 24


Exceptions Data Abort

Qualifiers Condition Code

Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 2 starting on page 3-98).

Register Rn: Specifies the base register used by <addressing_mode>. 


Data Abort: If a data abort is signalled and <addressing_mode> uses pre- 
indexed or post-indexed addressing, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 

Alignment: If an implementation includes a System Control Coprocessor 
(See Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
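
The rotation applied to non-word-aligned loads can be illustrated with a Python sketch. It assumes a little-endian memory modelled as a `bytes` object and alignment checking disabled; the names `ror32` and `ldr` are hypothetical helpers for the example.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def ldr(memory, address):
    """Model the LDR operation: load the containing word, then rotate."""
    aligned = address & ~0b11
    word = int.from_bytes(memory[aligned:aligned + 4], "little")
    return ror32(word, 8 * (address & 0b11))


memory = bytes([0x11, 0x22, 0x33, 0x44])
for addr in range(4):
    # In each case the addressed byte ends up in bits[7:0] of the result.
    print(addr, hex(ldr(memory, addr)))
```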
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LDR{<cond>}B Rd, <addressing_mode> 


Description Combined with a suitable addressing mode, the LDRB (Load Register Byte)
instruction allows 8-bit memory data to be loaded into a general-purpose register
where it can be manipulated. Using the PC as the base register allows PC-relative
addressing, to facilitate position-independent code.

Addressing mode 2


LDRB loads a byte from the memory address calculated by 
<addressing_mode>, zero-extends the byte to a 32-bit word, and writes 
the word to register Rd. 





The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation if ConditionPassed(<cond>) then 
Rd = Memory [<address>,1] 


Exceptions Data Abort 
Qualifiers Condition Code 


Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 2 starting on page 3-98).


Register Rn: Specifies the base register used by <addressing_mode>. 

Use of R15: If register 15 is specified for Rd the result is UNPREDICTABLE. 

Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Non-word-aligned addresses: Store Word instructions ignore the least-significant 
two bits of <address> (the words are not rotated as for Load Word). 

Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value (even if the same register is specified for Rd 
and Rn). 
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Addressing mode 2 




LDR{<cond>}BT Rd, <post_indexed_addressing_mode>


Description The LDRBT (Load Register Byte with Translation) instruction can be used by
a (privileged) exception handler that is emulating a memory access instruction that
would normally execute in User Mode. The access is restricted as if it has
User Mode privilege.


LDRBT loads a byte from the memory address calculated by
<post_indexed_addressing_mode>, zero-extends the byte to a 32-bit word,
and writes the word to register Rd. If the instruction is executed when
the processor is in a privileged mode, the memory system is signalled to treat
the access as if the processor was in user mode.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation 


if ConditionPassed(<cond>) then
    Rd = Memory[<address>,1]


Exceptions Data Abort

Qualifiers Condition Code

Notes Addressing modes: The I, P, and U bits specify the type of <addressing_mode>
(see Addressing Mode 2 starting on page 3-98).


Register Rn: Specifies the base register used by 
<post_indexed_addressing_mode>. 


User mode: If this instruction is executed in user mode, an ordinary user mode 
access is performed. 


Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 


Operand restrictions: If the same register is specified for Rd and Rn, the results 
are UNPREDICTABLE. 


Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 
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LDR{<cond>}H Rd, <addressing_mode> 


Description Used with a suitable addressing mode, the LDRH (Load Register Halfword)
instruction allows 16-bit memory data to be loaded into a general-purpose register
where its value can be manipulated.


Addressing mode 3

Architecture v4 only


Using the PC as the base register allows PC-relative addressing to facilitate
position-independent code.


LDRH loads a halfword from the memory address calculated by
<addressing_mode>, zero-extends the halfword to a 32-bit word, and writes
the word to register Rd. If the address is not halfword-aligned, the result is
UNPREDICTABLE.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation if ConditionPassed(<cond>) then 
    if <address>[0] == 0
        <data> = Memory[<address>,2]
    else /* <address>[0] == 1 */
        <data> = UNPREDICTABLE
    Rd = <data>


Exceptions Data Abort 
Qualifiers Condition Code 
Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 3 starting on page 3-109).
The addr_mode bits: These bits are addressing-mode specific. 
Register Rn: Specifies the base register used by <addressing_mode>. 
Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> uses pre- 
indexed or post-indexed addressing, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 


Non-half-word aligned addresses: If the load address is not halfword-aligned, 
the loaded value is UNPREDICTABLE. 

Alignment: If an implementation includes a System Control Coprocessor
(see Chapter 7), and alignment checking is enabled, an address with
bit[0] != 0 will cause an alignment exception.
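
A halfword load with zero-extension can be sketched in Python as below. The model assumes a little-endian memory held in a `bytes` object; `ldrh` is a hypothetical helper, and raising an exception for a misaligned address is a choice made for this sketch only, since the architecture leaves that result UNPREDICTABLE.

```python
def ldrh(memory, address):
    """Model LDRH: load 16 bits and zero-extend to a 32-bit value."""
    if address & 1:
        # The architecture does not define this case; the sketch rejects it.
        raise ValueError("UNPREDICTABLE: address is not halfword-aligned")
    return int.from_bytes(memory[address:address + 2], "little")


memory = bytes([0x34, 0x12, 0xFF, 0xFF])
print(hex(ldrh(memory, 0)))   # 0x1234
print(hex(ldrh(memory, 2)))   # 0xffff: bits[31:16] stay zero, no sign extension
```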
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Addressing mode 3 


Architecture v4 only 




LDR{<cond>}SB Rd, <addressing_mode>


Description Used with a suitable addressing mode, the LDRSB (Load Register Signed Byte)
instruction allows 8-bit signed memory data to be loaded into a general-purpose
register where it can be manipulated.


Using the PC as the base register allows PC-relative addressing, to facilitate
position-independent code.


LDRSB loads a byte from the memory address calculated by
<addressing_mode>, sign-extends the byte to a 32-bit word, and writes the word
to register Rd.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation 


if ConditionPassed(<cond>) then
    <data> = Memory[<address>,1]
    Rd = SignExtend(<data>)


Exceptions Data Abort

Qualifiers Condition Code

Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 3 starting on page 3-109).

The addr_mode bits: These bits are addressing mode specific. 

Register Rn: Specifies the base register used by <addressing_mode>. 

Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 

Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value (even if the same register is specified for 
Rd and Rn). 
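
The `SignExtend` step used by LDRSB can be illustrated with a short Python sketch. The helper names `sign_extend` and `ldrsb`, and the use of a `bytes` object as memory, are assumptions made for the example.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 32-bit word."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign_bit) - sign_bit) & 0xFFFFFFFF


def ldrsb(memory, address):
    """Model LDRSB: load one byte and sign-extend it."""
    return sign_extend(memory[address], 8)


memory = bytes([0x7F, 0x80, 0xFF])
print(hex(ldrsb(memory, 0)))   # 0x7f: bit 7 clear, upper bits stay zero
print(hex(ldrsb(memory, 1)))   # 0xffffff80: bit 7 set, upper bits filled with ones
print(hex(ldrsb(memory, 2)))   # 0xffffffff: the byte 0xFF represents -1
```

The same helper with `bits` set to 16 would model the sign extension of a halfword.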
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LDR{<cond>}SH Rd, <addressing_mode> 


Description Used with a suitable addressing mode, the LDRSH (Load Register Signed 
Halfword) instruction allows 16-bit signed memory data to be loaded into 
a general-purpose register where its value can be manipulated. 


Using the PC as the base register allows PC-relative addressing, to facilitate 
position-independent code. 


LDRSH loads a halfword from the memory address calculated by 
<addressing_mode>, sign-extends the halfword to a 32-bit word, and writes 
the word to register Rd. If the address is not halfword-aligned, the result is 
UNPREDICTABLE. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
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Operation if ConditionPassed(<cond>) then 
    if <address>[0] == 0
        <data> = Memory[<address>,2]
    else /* <address>[0] == 1 */
        <data> = UNPREDICTABLE
    Rd = SignExtend(<data>)





Exceptions Data Abort 
Qualifiers Condition Code 


Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 3 starting on page 3-109).


The addr_mode bits: These bits are addressing mode specific. 

Register Rn: Specifies the base register used by <addressing_mode>. 

Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 

Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 

Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value (even if the same register is specified for 
Rd and Rn). 


Non-half-word aligned addresses: If the load address is not halfword-aligned, 
the loaded value is UNPREDICTABLE. 


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bit[0] != 0 causes an alignment exception. 
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Addressing mode 3 


Architecture v4 only 




LDR{<cond>}T Rd, <post_indexed_addressing_mode> 


Description The LDRT (Load Register with Translation) instruction can be used by
a (privileged) exception handler that is emulating a memory access instruction that
would normally execute in User Mode. The access is restricted as if it has
User Mode privilege.

Addressing mode 2


LDRT loads a word from the memory address and writes it to register Rd.
If the instruction is executed when the processor is in a privileged mode,
the memory system is signalled to treat the access as if the processor was in user
mode.





The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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addressing mode specific 
Operation if ConditionPassed(<cond>) then
    if <address>[1:0] == 0b00
        Rd = Memory[<address>,4]
    else if <address>[1:0] == 0b01
        Rd = Memory[<address>,4] Rotate_Right 8
    else if <address>[1:0] == 0b10
        Rd = Memory[<address>,4] Rotate_Right 16
    else /* <address>[1:0] == 0b11 */
        Rd = Memory[<address>,4] Rotate_Right 24








Exceptions Data Abort 

Qualifiers Condition Code 

Notes Addressing modes: The I, P, and U bits specify the type of <addressing_mode> 
(see Addressing Mode 2 starting on page 3-98). 


Register Rn: Specifies the base register used by 
<post_indexed_addressing_mode>. 

User mode: If this instruction is executed in user mode, an ordinary user mode 
access is performed. 


Operand restrictions: If the same register is specified for Rd and Rn the results are 
UNPREDICTABLE. 

Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION
DEFINED, but is either the original base register value or the updated base
register value (even if the same register is specified for Rd and Rn).

Alignment: If an implementation includes a System Control Coprocessor 
(See Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 causes an alignment exception. 
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MCR{<cond>} p<cp#>, <opcode_1>, Rd, CRn, CRm, <opcode_2> 


Description The MCR (Move to Coprocessor from ARM Register) instruction is used to initiate 
coprocessor instructions that operate on values in ARM registers, for example 
a fixed-point to floating-point conversion instruction for a floating-point accelerator 
coprocessor. 





MCR passes the value of register Rd to the coprocessor specified by <cp_num>.
If no coprocessors indicate that they can execute the instruction, an UNDEFINED
instruction exception is generated.

Not in architecture v1


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 
Coprocessor[<cp_num>] = Rd 


Exceptions Undefined Instruction 


Qualifiers Condition Code 


Notes Coprocessor fields: Only instruction bits[31:24], bit[20], bits[15:8] and bit[4] are 
ARM architecture-defined; the remaining fields are only recommendations, for 
compatibility with ARM Development Systems. 
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MLA{<cond>}{<S>} Rd, Rm, Rs, Rn 


Description The MLA (Multiply Accumulate) instruction multiplies signed or unsigned operands 
to produce a 32-bit result, which is then added to a third operand, and written to 
the destination register. 





MLA multiplies the value of register Rm with the value of register Rs, adds
the value of register Rn, and stores the result in the destination register Rd.
The condition code flags are optionally updated (based on the result).

Not in architecture v1

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.


Operation if ConditionPassed(<cond>) then
    Rd = (Rm * Rs + Rn)[31:0]
    if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = UNPREDICTABLE
        V Flag = unaffected


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N and Z 
Notes Use of R15: Specifying R15 for register Rd, Rm, Rs or Rn has UNPREDICTABLE 
results. 


Operand restriction: Specifying the same register for Rd and Rm has 
UNPREDICTABLE results. 


Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 


Signed and unsigned: Because the MLA instruction produces only the lower 
32 bits of the 64-bit product, MLA gives the same answer for multiplication of 
both signed and unsigned numbers. 
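
The signed and unsigned note can be checked with a small model. The following Python sketch is illustrative only and is not part of the architecture definition; the helper names `mla` and `to_signed` are hypothetical, and registers are modelled as plain integers truncated to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mla(rm, rs, rn):
    """Model Rd = (Rm * Rs + Rn)[31:0] and the N and Z flags written when S == 1."""
    rd = (rm * rs + rn) & MASK32
    return rd, {"N": rd >> 31, "Z": int(rd == 0)}


# The same bit patterns, read once as unsigned and once as signed values.
rm, rs, rn = 0xFFFFFFFE, 0x00000003, 0x0000000A  # -2, 3 and 10 when read as signed
unsigned_rd, _ = mla(rm, rs, rn)
signed_rd, flags = mla(to_signed(rm), to_signed(rs), to_signed(rn))
print(hex(unsigned_rd), hex(signed_rd), flags)
```

Both calls leave 4 in the low 32 bits, which is why a single instruction serves signed and unsigned operands. The C and V flags are left out of the model because the Operation section defines C as UNPREDICTABLE and V as unaffected.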




MOV{<cond>}{S} Rd, <shifter_operand>


Addressing mode 1


Description The MOV (Move) instruction is used to:
* move a value from one register to another
* put a constant value into a register
* perform a shift without any other arithmetic or logical operation

When the PC is the destination of the instruction, a branch occurs, and
MOV PC, LR can be used to return from a subroutine call (see the B and BL
instructions) and to return from some types of exception (see 2.5 Exceptions on
page 2-6).

MOV moves the value of <shifter_operand> to the destination register Rd,
and optionally updates the condition code flags (based on the result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 

    Rd = <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N,Z and C 


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the result
of the operation is placed in the PC. When Rd is R15 and the S flag is set, 
the result of the operation is placed in the PC and the SPSR corresponding to 
the current mode is moved to the CPSR. This allows state changes which 
atomically restore both PC and CPSR. This form of the instruction is 
UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>. 
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Addressing mode 4 


MRC{<cond>} p<cp#>, <opcode_1>, Rd, CRn, CRm, <opcode_2>


Description The MRC (Move to ARM Register from Coprocessor) instruction is used to initiate
coprocessor instructions that return values to ARM registers, for example
a floating-point to fixed-point conversion instruction for a floating-point accelerator
coprocessor.

Specifying R15 as the destination register is useful for operations like
a floating-point compare instruction.

MRC has two uses:

1 If Rd specifies register 15, the condition code flag bits are updated from
the top four bits of the value from the coprocessor specified by <cp_num>
(to allow conditional branching on the status of a coprocessor) and the other
28 bits are IGNORED.

2 Otherwise the instruction writes into register Rd a value from the coprocessor
specified by <cp#>.

If no coprocessors indicate that they can execute the instruction, an UNDEFINED
instruction exception is generated.

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.


Operation if ConditionPassed(<cond>) then
    if Rd == 15 then
        N flag = (value from Coprocessor[<cp_num>])[31]
        Z flag = (value from Coprocessor[<cp_num>])[30]
        C flag = (value from Coprocessor[<cp_num>])[29]
        V flag = (value from Coprocessor[<cp_num>])[28]
    else /* Rd != 15 */
        Rd = value from Coprocessor[<cp_num>]


Exceptions Undefined Instruction


Qualifiers Condition Code


Notes Coprocessor fields: Only instruction bits[31:24], bit[20], bits[15:8] and bit[4] are
ARM architecture-defined; the remaining fields are only recommendations,
for compatibility with ARM Development Systems.
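
The R15 form of MRC can be pictured with a short model. This Python sketch is an illustration under stated assumptions, not a definition: `mrc_flags` is a hypothetical helper name, and the coprocessor value is simply passed in as a 32-bit integer rather than obtained from real hardware.

```python
def mrc_flags(coprocessor_value):
    """Return the N, Z, C and V flags taken from bits 31 to 28 of the value.

    The remaining 28 bits are ignored, as described for Rd == 15.
    """
    return {
        "N": (coprocessor_value >> 31) & 1,
        "Z": (coprocessor_value >> 30) & 1,
        "C": (coprocessor_value >> 29) & 1,
        "V": (coprocessor_value >> 28) & 1,
    }


# Top nibble 0b0110: Z and C set, N and V clear. The low 28 bits do not matter.
flags = mrc_flags(0x60000123)
print(flags)
```

A coprocessor compare operation could return a value of this shape so that an ordinary conditional branch can follow the MRC.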
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MRS{<cond>} Rd, CPSR 
MRS{<cond>} Rd, SPSR 


Description The MRS instruction moves the value of the CPSR or the SPSR of the current 
mode into a general-purpose register. In the general-purpose register, the value 
can be examined or manipulated with normal data-processing instructions. 


The MRS moves the value of the CPSR, or the value of the SPSR corresponding 
to the current mode, to a general-purpose register. 





The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then 
if R == 1 then 
Rd = SPSR 
else 
Rd = CPSR 
Exceptions None 
Qualifiers Condition Code 
Notes Opcode [11:0]: Execution of MRS instructions with any non-zero bits in
opcode[11:0] is UNPREDICTABLE.


Opcode [19:16]: Execution of MRS instructions with any non-one bits in
opcode[19:16] is UNPREDICTABLE.


User mode SPSR: Accessing the SPSR when in user mode or system mode is 
UNPREDICTABLE. 
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MSR{<cond>} CPSR_f, #32bit immediate 
MSR{<cond>} CPSR_<fields>,Rm 
MSR{<cond>} SPSR_f, #32bit immediate 
MSR{<cond>} SPSR_<fields>, Rm 


Description The MSR (Move to Status register from ARM Register) instruction transfers 
the value of a general-purpose register to the CPSR or the SPSR of the current 
mode. This is used to update the value of the condition code flags, interrupt 
enables, or the processor mode. 


MSR moves the value of Rm or the value of the 32-bit immediate (encoded as 
an 8-bit value with rotate) to the CPSR or the SPSR corresponding to the current 
mode. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
[Encoding diagrams: immediate operand form and register operand form]


Operation if ConditionPassed(<cond>) then
    if opcode[25] == 1
        <operand> = <8_bit_immediate> Rotate_Right (<rotate_imm> * 2)
    else /* opcode[25] == 0 */
        <operand> = Rm
    if R == 0 then
        if <field_mask>[0] == 1 and InAPrivilegedMode() then
            CPSR[7:0] = <operand>[7:0]
        if <field_mask>[1] == 1 and InAPrivilegedMode() then
            CPSR[15:8] = <operand>[15:8]
        if <field_mask>[2] == 1 and InAPrivilegedMode() then
            CPSR[23:16] = <operand>[23:16]
        if <field_mask>[3] == 1 then
            CPSR[31:24] = <operand>[31:24]
    else /* R == 1 */
        if <field_mask>[0] == 1 and CurrentModeHasSPSR() then
            SPSR[7:0] = <operand>[7:0]
        if <field_mask>[1] == 1 and CurrentModeHasSPSR() then
            SPSR[15:8] = <operand>[15:8]
        if <field_mask>[2] == 1 and CurrentModeHasSPSR() then
            SPSR[23:16] = <operand>[23:16]
        if <field_mask>[3] == 1 and CurrentModeHasSPSR() then
            SPSR[31:24] = <operand>[31:24]
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Exceptions 


None


Qualifiers Condition Code
<fields> is one of:
_c sets the control field mask bit (bit 0)
_x sets the extension field mask bit (bit 1)
_s sets the status field mask bit (bit 2)
_f sets the flags field mask bit (bit 3)


Architecture v3 and v4 only


Notes Immediate Operand: The immediate form of this instruction can only be used to set
the flag bits (PSR bits 31:24). Using the immediate form on any other fields
has UNPREDICTABLE results.

PSR Update: The value of a PSR must be updated by moving the PSR to 
a general-purpose register (using the MRS instruction), modifying 
the relevant bits of the general-purpose register, and restoring the updated 
general-purpose register value back into the PSR (using the MSR instruction). 

User Mode CPSR: Any writes to CPSR[23:0] in user mode are IGNORED (so that 
user mode programs cannot change to a privileged mode). 


User mode SPSR: Accessing the SPSR when in user mode is UNPREDICTABLE. 
System mode SPSR: Accessing the SPSR when in system mode is 
UNPREDICTABLE. 


Deprecated field specification: The CPSR, CPSR_flg, CPSR_ctl, CPSR_all, 
SPSR, SPSR_flg, SPSR_ctl and SPSR_all forms of PSR field specification 
have been superseded by the csxf format shown above. 

CPSR, SPSR, CPSR_all and SPSR_all produce a field mask of 0b1001.
CPSR_flg and SPSR_flg produce a field mask of 0b1000.
CPSR_ctl and SPSR_ctl produce a field mask of 0b0001.
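
The way the field mask gates each byte of the PSR can be sketched in a few lines. This Python model is an illustration only: the function name `msr` and its keyword parameters are hypothetical, the PSR is a plain 32-bit integer, and the UNPREDICTABLE SPSR access in User or System mode is modelled as "no change" purely for simplicity. The position of the I bit (bit 7) is taken from the CPSR layout and is not stated in this section.

```python
def msr(psr, operand, field_mask, privileged=True, is_spsr=False, has_spsr=True):
    """Return the PSR value after an MSR with the given 4-bit field mask.

    Each mask bit selects one byte of the PSR. For the CPSR, only the flags
    byte (mask bit 3) may be written from an unprivileged mode.
    """
    for field in range(4):
        if not (field_mask >> field) & 1:
            continue
        if is_spsr:
            allowed = has_spsr                  # CurrentModeHasSPSR()
        else:
            allowed = privileged or field == 3  # InAPrivilegedMode() gates bytes 0 to 2
        if allowed:
            byte_mask = 0xFF << (8 * field)
            psr = (psr & ~byte_mask & 0xFFFFFFFF) | (operand & byte_mask)
    return psr


# Read-modify-write as in the PSR Update note: copy the CPSR, set bit 7 (I bit),
# then write back only the control field (mask 0b0001).
cpsr = 0x60000013
irq_disabled = msr(cpsr, cpsr | 0x80, 0b0001)

# The same kind of write from User mode: CPSR[23:0] is left alone, flags are written.
user_result = msr(0x60000010, 0x000000D3, 0b1001, privileged=False)
print(hex(irq_disabled), hex(user_result))
```

The second call shows why a User mode program cannot change mode: the control byte of the operand is discarded, and only bits 31:24 are taken from it.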
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MUL{<cond>}{<S>} Rd, Rm, Rs 


Description The MUL (Multiply) instruction is used to multiply signed or unsigned variables 
to produce a 32-bit result. 
MUL multiplies the value of register Rm with the value of register Rs, and stores 
the result in the destination register Rd. The condition code flags are optionally 
updated (based on the result). 
The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

Operation if ConditionPassed(<cond>) then
    Rd = (Rm * Rs)[31:0]
    if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = UNPREDICTABLE
        V Flag = unaffected


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N and Z
Notes Use of R15: Specifying R15 for register Rd, Rm or Rs has UNPREDICTABLE results. 


Operand restriction: Specifying the same register for Rd and Rm has 
UNPREDICTABLE results. 


Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 


Signed and unsigned: Because the MUL instruction produces only the lower 
32 bits of the 64-bit product, MUL gives the same answer for multiplication of 
both signed and unsigned numbers. 




MVN{<cond>}{S} Rd, <shifter_operand>


Addressing mode 1


Description The MVN (Move negative) instruction is used to:
* write a negative value into a register
* form a bit mask
* take the one's complement of a value

MVN moves the logical one's complement of the value of <shifter_operand>
to the destination register Rd, and optionally updates the condition code flags
(based on the result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then
    Rd = NOT <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected
Exceptions None 
Qualifiers Condition Code 
S updates condition code flags N, Z and C


Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, the result
of the operation is placed in the PC. When Rd is R15 and the S flag is set, 
the result of the operation is placed in the PC and the SPSR corresponding to 
the current mode is moved to the CPSR. This allows state changes which 
atomically restore both PC and CPSR. This form of the instruction is 
UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 
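
The first two uses listed in the Description follow directly from the one's complement. The Python sketch below is illustrative only; `mvn` and `to_signed` are hypothetical helper names, and the shifter operand is passed in as an already-evaluated 32-bit value.

```python
MASK32 = 0xFFFFFFFF


def mvn(shifter_operand):
    """Model Rd = NOT <shifter_operand>, truncated to 32 bits."""
    return ~shifter_operand & MASK32


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# Writing a negative value: NOT n equals -(n + 1), so MVN Rd, #4 leaves -5 in Rd.
minus_five = mvn(4)

# Forming a bit mask: MVN Rd, #0xFF gives a mask that clears the low byte.
clear_low_byte = mvn(0xFF)
print(to_signed(minus_five), hex(clear_low_byte))
```

In two's complement, the constant loaded by MVN is therefore always one less than the negation of the operand.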
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ORR{<cond>}{S} Rd, Rn, <shifter_operand> 


Addressing mode 1


Description The ORR (Logical OR) instruction can be used to set selected bits in a register;
for each bit, OR with 1 sets the bit and OR with 0 leaves it unchanged.

ORR performs a bitwise (inclusive) OR of the value of register Rn with the value of
<shifter_operand>, and stores the result in the destination register Rd.
The condition code flags are optionally updated (based on the result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then 

    Rd = Rn OR <shifter_operand>
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = <shifter_carry_out>
        V Flag = unaffected
Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N, Z and C 


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of 
the instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>. 




RSB{<cond>}{S} Rd, Rn, <shifter_operand> 


Addressing mode 1


Description The RSB (Reverse Subtract) instruction subtracts the value of register Rn from
the value of <shifter_operand>, and stores the result in the destination
register Rd. The condition code flags are optionally updated (based on the result).

The following instruction stores the negative (two's complement) of Rx in Rd:

RSB Rd, Rx, #0

Constant multiplication (of Rx) by 2^n - 1 (into Rd) can be performed with:

RSB Rd, Rx, Rx, LSL #n

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then 

    Rd = <shifter_operand> - Rn
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = NOT BorrowFrom(<shifter_operand> - Rn)
        V Flag = OverflowFrom(<shifter_operand> - Rn)


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N,Z,C,V 


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of 
the instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 
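
The two idioms in the Description (negation, and multiplication by a constant of the form 2^n - 1) can be checked with a small model. This Python sketch is illustrative only; `rsb` and `lsl` are hypothetical helper names and the flag updates are left out.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK32


def rsb(rn, shifter_operand):
    """Model Rd = <shifter_operand> - Rn, truncated to 32 bits."""
    return (shifter_operand - rn) & MASK32


rx = 7
negated = rsb(rx, 0)             # models RSB Rd, Rx, #0
times_15 = rsb(rx, lsl(rx, 4))   # models RSB Rd, Rx, Rx, LSL #4
print(hex(negated), times_15)
```

With a shift of 4 the result is (Rx * 16) - Rx, which is Rx * 15; other shift amounts give the other multipliers of the same form.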
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RSC{<cond>}{S} Rd, Rn, <shifter_operand> 


Description The RSC (Reverse Subtract with Carry) instruction subtracts the value of register 
Rn and the value of NOT (Carry Flag) from the value of <shifter_operand>, 
and stores the result in the destination register Rd. The condition code flags are 
optionally updated (based on the result). 

To negate the 64-bit value in R0,R1 (R0 holds the least-significant word) and
store the result in R2,R3, use the following sequence:

RSBS R2,R0,#0
RSC R3,R1,#0

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.


Operation if ConditionPassed(<cond>) then
    Rd = <shifter_operand> - Rn - NOT(C Flag)
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = NOT BorrowFrom(<shifter_operand> - Rn - NOT(C Flag))
        V Flag = OverflowFrom(<shifter_operand> - Rn - NOT(C Flag))


Exceptions None


Qualifiers Condition Code
S updates condition code flags N, Z, C and V


Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.

Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of 
the instruction is UNPREDICTABLE in User mode and System mode. 

The I bit: Bit 25 is used to distinguish between the immediate and register forms 
of <shifter_operand>. 
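
The 64-bit negation sequence in the Description can be traced with a small model. This Python sketch is illustrative only; `rsbs`, `rsc` and `negate64` are hypothetical helper names, registers are plain integers, and only the C flag is modelled because it is the only flag the sequence depends on.

```python
MASK32 = 0xFFFFFFFF


def rsbs(rn, operand):
    """Model RSBS: return (operand - Rn)[31:0] and C = NOT BorrowFrom(operand - Rn)."""
    carry = int(operand >= rn)  # no borrow when the unsigned operand is at least Rn
    return (operand - rn) & MASK32, carry


def rsc(rn, operand, carry):
    """Model RSC: operand - Rn - NOT(C Flag), truncated to 32 bits."""
    return (operand - rn - (1 - carry)) & MASK32


def negate64(r0, r1):
    """Model RSBS R2,R0,#0 followed by RSC R3,R1,#0; returns (R2, R3)."""
    r2, carry = rsbs(r0, 0)
    r3 = rsc(r1, 0, carry)
    return r2, r3


low, high = negate64(0x00000001, 0x00000000)  # negate the 64-bit value 1
print(hex(high), hex(low))
```

The low word produces a borrow whenever it is non-zero, and RSC folds that borrow into the high word, giving the 64-bit two's complement.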
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SBC{<cond>}{S} Rd, Rn, <shifter_operand> 


Description The SBC (Subtract with Carry) instruction is used to synthesize multi-word
subtraction. If register pairs R0,R1 and R2,R3 hold 64-bit values (R0 and R2 hold
the least-significant words), the following instructions leave the 64-bit difference in
R4,R5:

SUBS R4,R0,R2
SBC R5,R1,R3

SBC subtracts the value of <shifter_operand> and the value of NOT (Carry
Flag) from the value of register Rn, and stores the result in the destination register
Rd. The condition code flags are optionally updated (based on the result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.


Operation if ConditionPassed(<cond>) then
    Rd = Rn - <shifter_operand> - NOT(C Flag)
    if S == 1 and Rd == R15 then
        CPSR = SPSR
    else if S == 1 then
        N Flag = Rd[31]
        Z Flag = if Rd == 0 then 1 else 0
        C Flag = NOT BorrowFrom(Rn - <shifter_operand> - NOT(C Flag))
        V Flag = OverflowFrom(Rn - <shifter_operand> - NOT(C Flag))


Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N, Z, C and V


Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.

Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of 
the instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms 
of <shifter_operand>. 
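
The multi-word subtraction in the Description relies on the C flag acting as an inverted borrow. The Python sketch below models that pairing; it is an illustration only, `subs`, `sbc` and `sub64` are hypothetical helper names, and flags other than C are omitted.

```python
MASK32 = 0xFFFFFFFF


def subs(rn, operand):
    """Model SUBS: return (Rn - operand)[31:0] and C = NOT BorrowFrom(Rn - operand)."""
    return (rn - operand) & MASK32, int(rn >= operand)


def sbc(rn, operand, carry):
    """Model SBC: Rn - operand - NOT(C Flag), truncated to 32 bits."""
    return (rn - operand - (1 - carry)) & MASK32


def sub64(r0, r1, r2, r3):
    """Model SUBS R4,R0,R2 followed by SBC R5,R1,R3; returns (R4, R5)."""
    r4, carry = subs(r0, r2)
    r5 = sbc(r1, r3, carry)
    return r4, r5


# 0x00000001_00000000 minus 0x00000000_00000001
r4, r5 = sub64(0x00000000, 0x00000001, 0x00000001, 0x00000000)
print(hex(r5), hex(r4))
```

Here the low words borrow (C is cleared), so SBC subtracts one extra from the high words and the result is 0x00000000_FFFFFFFF.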
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Architecture v3 
and v4 only 


SMLAL{<cond>}{<S>} RdLo, RdHi, Rm, Rs


Description The SMLAL (Signed Multiply Accumulate Long) instruction multiplies signed
variables to produce a 64-bit result, which is added to the 64-bit value in the two 
destination general-purpose registers. The result is written back to the two 
destination general-purpose registers. 


SMLAL multiplies the signed value of register Rm with the signed value of register 
Rs to produce a 64-bit result. The lower 32 bits of the result are added to RdLo and 
stored in RdLo; the upper 32 bits, and the carry from the addition to RdLo, are 
added to RdHi and stored in RdHi. The condition code flags are optionally updated 
(based on the 64-bit result). 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 


Operation if ConditionPassed(<cond>) then
    RdLo = (Rm * Rs)[31:0] + RdLo
    RdHi = (Rm * Rs)[63:32] + RdHi + CarryFrom((Rm * Rs)[31:0] + RdLo)
    if S == 1 then
        N Flag = RdHi[31]
        Z Flag = if (RdHi == 0) and (RdLo == 0) then 1 else 0
        C Flag = UNPREDICTABLE
        V Flag = UNPREDICTABLE
Exceptions None 
Qualifiers Condition Code 
S updates condition code flags N,Z 
Notes Use of R15: Specifying R15 for register RdHi, RdLo, Rm or Rs has UNPREDICTABLE
results.


Operand restriction: Specifying the same register for RdHi and Rm has 
UNPREDICTABLE results. 
Specifying the same register for RdLo and Rm has UNPREDICTABLE results. 
Specifying the same register for RdHi and RdLo has UNPREDICTABLE results. 


Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 
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SMULL{<cond>}{<S>} RdLo, RdHi, Rm, Rs 


Description The SMULL (Signed Multiply Long) instruction multiplies signed variables 
to produce a 64-bit result in two general-purpose registers. 


SMULL multiplies the signed value of register Rm with the signed value of register 
Rs to produce a 64-bit result. The upper 32 bits of the result are stored in RdHi; 
the lower 32 bits are stored in RdLo. The condition code flags are optionally 
updated (based on the 64-bit result). 
The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

Operation if ConditionPassed(<cond>) then
    RdHi = (Rm * Rs)[63:32]
    RdLo = (Rm * Rs)[31:0]
    if S == 1 then
        N Flag = RdHi[31]
        Z Flag = if (RdHi == 0) and (RdLo == 0) then 1 else 0
        C Flag = UNPREDICTABLE
        V Flag = UNPREDICTABLE





Exceptions None 


Qualifiers Condition Code 
S updates condition code flags N,Z 
Notes Use of R15: Specifying R15 for register RdHi, RdLo, Rm or Rs has UNPREDICTABLE 
results. 


Operand restriction: Specifying the same register for RdHi and Rm has 
UNPREDICTABLE results. 
Specifying the same register for RdLo and Rm has UNPREDICTABLE results. 
Specifying the same register for RdHi and RdLo has UNPREDICTABLE results. 


Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 
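
The split of a signed 64-bit product across RdHi and RdLo, and the carry that SMLAL propagates from the low word into the high word, can be modelled briefly. This Python sketch is illustrative only; `smull`, `smlal` and `to_signed32` are hypothetical helper names, and `smlal` reads the accumulator values before any register is written.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def smull(rm, rs):
    """Model SMULL: return (RdLo, RdHi) for the signed 64-bit product."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32


def smlal(rd_lo, rd_hi, rm, rs):
    """Model SMLAL: add the signed 64-bit product to the value in RdHi:RdLo."""
    prod_lo, prod_hi = smull(rm, rs)
    low_sum = prod_lo + rd_lo
    carry = low_sum >> 32  # CarryFrom the low-word addition
    return low_sum & MASK32, (prod_hi + rd_hi + carry) & MASK32


product = smull(0xFFFFFFFF, 2)            # -1 * 2 = -2
accumulated = smlal(5, 0, 0xFFFFFFFF, 2)  # 5 + (-1 * 2) = 3
print([hex(x) for x in product], [hex(x) for x in accumulated])
```

Unlike MUL and MLA, the high word depends on whether the operands are treated as signed, which is why the long multiplies have separate signed forms.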
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Addressing mode 5 


Not in architecture v1 




STC{<cond>} p<cp_num>, CRd, <addressing_mode> 


Description The STC (Store Coprocessor) instruction is useful for storing coprocessor data 
to memory. The N bit could be used to distinguish between a single- and 
double-precision transfer for a floating-point store instruction. 


STC stores data from the coprocessor specified by <cp_num> to the sequence of 
consecutive memory addresses calculated by <addressing_mode>. If no 
coprocessors indicate that they can execute the instruction, an UNDEFINED 
instruction exception is generated. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

Operation if ConditionPassed(<cond>) then
    <address> = <start_address>
    while (NotFinished(Coprocessor[<cp_num>]))
        Memory[<address>,4] = value from Coprocessor[<cp_num>]
        <address> = <address> + 4
    assert <address> == <end_address>


Exceptions Undefined Instruction; Data Abort
Qualifiers Condition Code 


Notes Addressing mode: The P, U and W bits specify the <addressing_mode>. 
See Addressing Mode 5 starting on page 3-123. 


The N bit: This bit is coprocessor-dependent. It can be used to distinguish 
between two sizes of data to transfer. 


Register Rn: Specifies the base register used by <addressing_mode>. 

Coprocessor fields: Only instruction bits[31:23], bits[21:16] and bits[11:0] are
ARM architecture-defined; the remaining fields (bit[22] and bits[15:12]) are 
only recommendations, for compatibility with ARM Development Systems. 

Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value. 


Non-word-aligned addresses: Store coprocessor register instructions ignore 
the least-significant two bits of <address> (the words are not rotated as for 
load word). 

Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
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STM{<cond>}<addressing_mode> Rn{!}, <registers> 


Description The STM (Store Multiple) instruction is useful as a block store instruction 
(combined with load multiple it allows efficient block copy) and for stack 
operations, including procedure entry to save general-purpose registers and the 
return address, and for updating the stack pointer. 


STM stores a non-empty subset (or possibly all) of the general-purpose registers 
to sequential memory locations. The registers are stored in sequence, the 
lowest-numbered register first, to the lowest memory address 
(<start_address>); the highest-numbered register last, to the highest memory 
address (<end_address>). 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
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Operation if ConditionPassed(<cond>) then 
    <address> = <start_address>
    for i = 0 to 15
        if <register_list>[i] == 1
            Memory[<address>,4] = Ri
            <address> = <address> + 4
    assert <end_address> == <address> - 4
Exceptions Data Abort 
Qualifiers Condition Code 


! sets the W bit, causing base register update 


Notes Addressing mode: The P, U and W bits distinguish between the different types of 
addressing mode. See Addressing Mode 4 starting on page 3-116. 


Register Rn: Specifies the base register used by <addressing_mode>. 

Use of R15: If register 15 is specified as the base register Rn, the result is
UNPREDICTABLE. If register 15 is specified in <register_list>, the value
stored is IMPLEMENTATION DEFINED.

Operand restrictions: If Rn is specified in <register_list>, and writeback is 
specified, the stored value of Rn is UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> specifies 
writeback, the value left in Rn is IMPLEMENTATION DEFINED, but is either 
the original base register value or the updated base register value. 

Non-word-aligned addresses: STM instructions ignore the least-significant two bits
of <address> (the words are not rotated as for load word).

Alignment: If an implementation includes a System Control Coprocessor 
(See Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
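
The store order described above (lowest-numbered register to the lowest address) can be sketched as follows. This Python model is an illustration only: the function name `stm` is hypothetical, memory is a dictionary keyed by word address, `start_address` is assumed to have been computed by the addressing mode already, and base register writeback is not modelled.

```python
def stm(registers, register_list, start_address, memory):
    """Store the registers selected by the 16-bit register_list, lowest first.

    Returns the address of the last word written (<end_address>).
    """
    address = start_address
    for i in range(16):
        if (register_list >> i) & 1:
            memory[address] = registers[i]
            address += 4
    return address - 4


registers = list(range(100, 116))            # R0 = 100, R1 = 101, ..., R15 = 115
register_list = (1 << 0) | (1 << 4) | (1 << 14)
memory = {}
end_address = stm(registers, register_list, 0x8000, memory)
print(hex(end_address), memory)
```

The order in which registers are named in the assembler source does not matter in this model; only the bit positions in the register list decide where each register lands.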
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STM{<cond>}<addressing_mode> Rn{!}, <registers>^


Description The STM (Store Multiple) instruction is used to store the user mode registers when 
the processor is in a privileged mode (useful when performing process swaps). 


This form of STM stores a subset (or possibly all) of the user mode general- 
purpose registers (which are also the system mode general-purpose registers) to 
sequential memory locations. The registers are stored in sequence, the lowest- 
numbered register first, to the lowest memory address (<start_addr>); the 
highest-numbered register last, to the highest memory address (<end_addr>). 


The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then 
    <address> = <start_addr>
    for i = 0 to 15
        if <register_list>[i] == 1
            Memory[<address>,4] = Ri_usr
            <address> = <address> + 4
    assert <end_addr> == <address> - 4
Exceptions Data Abort 
Qualifiers Condition Code 
Notes Addressing mode: The P and W bits distinguish between the different types of
addressing mode. See Addressing Mode 4 starting on page 3-116.


Banked registers: This instruction must not be followed by an instruction which 
accesses banked registers (a following NOP is a good way to ensure this). 


Writeback: Setting bit 21 (the W bit) has UNPREDICTABLE results. 
User and System mode: This instruction is UNPREDICTABLE in user or system mode. 
Register Rn: Specifies the base register used by <addressing_mode>. 


Use of R15: If register 15 is specified as the base register Rn, the result is 
UNPREDICTABLE. If register 15 is specified in <register_list> the value 
stored is IMPLEMENTATION DEFINED. 

Base register mode: The base register is read from the current processor mode 
registers, not the user mode registers. 


Data Abort: If a data abort is signalled, the value left in Rn is the original base 
register value. 


Non-word-aligned addresses: STM instructions ignore the least-significant
two bits of <address> (the words are not rotated as for load word).


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 causes an alignment exception. 
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STR{<cond>} Rd, <addressing_mode> 


Addressing mode 2


Description Combined with a suitable addressing mode, the STR (Store Register) instruction
stores 32-bit data from a general-purpose register into memory. Using the PC as
the base register allows PC-relative addressing, to facilitate position-independent
code.

STR stores a word from register Rd to the memory address calculated by
<addressing_mode>.

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation if ConditionPassed(<cond>) then 

Memory [<address>,4] = Rd 


Exceptions Data Abort 
Qualifiers Condition Code 


Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 2 starting on page 3-98). 


Register Rn: Specifies the base register used by <addressing_mode>. 


Use of R15: If register 15 is specified for Rd, the value stored is IMPLEMENTATION 
DEFINED. 


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is IMPLEMENTATION
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 


Alignment: If an implementation includes a System Control Coprocessor 
(See Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
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Addressing mode 2 




STR{<cond>}B Rd, <addressing_mode> 


Description Combined with a suitable addressing mode, the STRB (Store Register Byte) writes 
the least-significant byte of a general-purpose register to memory. 
Using the PC as the base register allows PC-relative addressing, to facilitate 
position-independent code. 
STRB stores a byte from the least-significant byte of register Rd to the memory 
address calculated by <addressing_mode>. 
The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              Memory[<address>,1] = Rd[7:0]

Exceptions Data Abort
Qualifiers Condition Code

Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 2 starting on page 3-98).

Register Rn: Specifies the base register used by <addressing_mode>. 

Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value (even if the same register is specified for Rd 
and Rn). 
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STR{<cond>}BT Rd, <post_indexed_addressing_mode> 


Description The STRBT (Store Register Byte with Translation) instruction can be used by 
a (privileged) exception handler that is emulating a memory access instruction 
which would normally execute in User Mode. The access is restricted as if it has 
User Mode privilege. 


STRBT stores a byte from the least-significant byte of register Rd to the memory 
address calculated by <post_indexed_addressing_mode>. If the instruction 
is executed when the processor is in a privileged mode, the memory system is 
signalled to treat the access as if the processor were in user mode. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]
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Operation if ConditionPassed(<cond>) then 


Memory[<address>,1] = Rd[7:0] 
Exceptions Data Abort 
Qualifiers Condition Code 
Notes Addressing modes: The I, P and U bits specify the type of <addressing_mode>
(see Addressing Mode 2 starting on page 3-98).

Register Rn: Specifies the base register used by 
<post_indexed_addressing_mode>. 

User mode: If this instruction is executed in user mode, an ordinary user mode 
access is performed. 

Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE. 

Operand restrictions: If the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 
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Addressing mode 3 


Architecture v4 only 
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STR{<cond>}H Rd, <addressing_mode> 


Description Combined with a suitable addressing mode, the STRH (Store Register Halfword) 
instruction allows 16-bit data from a general-purpose register to be stored to 
memory. Using the PC as the base register allows PC-relative addressing, 
to facilitate position-independent code. 


STRH stores a halfword from the least-significant halfword of register Rd to
the memory address calculated by <addressing_mode>. If the address is not 
halfword-aligned, the result is UNPREDICTABLE. 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              if <address>[0] == 0
                  <data> = Rd[15:0]
              else /* <address>[0] == 1 */
                  <data> = UNPREDICTABLE
              Memory[<address>,2] = <data>


Exceptions Data Abort 
Qualifiers Condition Code 
Notes Addressing modes: The I, P, U and W bits specify the type of
<addressing_mode> (see Addressing Mode 3 starting on page 3-109).

The addr_mode bits: These bits are addressing-mode specific.

Register Rn: Specifies the base register used by <addressing_mode>.

Use of R15: If register 15 is specified for Rd, the result is UNPREDICTABLE.


Operand restrictions: If <addressing_mode> uses pre-indexed or post-indexed 
addressing, and the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 


Data Abort: If a data abort is signalled and <addressing_mode> uses 
pre-indexed or post-indexed addressing, the value left in Rn is 
IMPLEMENTATION DEFINED, but is either the original base register value or 
the updated base register value (even if the same register is specified for Rd 
and Rn). 


Non-halfword-aligned addresses: If the store address is not halfword-aligned,
the stored value is UNPREDICTABLE.


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bit[0] != 0 will cause an alignment exception. 
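
The STRH operation can be sketched as a small Python model. This is an illustration only: the name store_halfword, the dict used as byte-addressed memory, the little-endian byte order and the AlignmentFault exception are all assumptions, and the model always behaves as if alignment checking were enabled rather than reproducing UNPREDICTABLE behaviour.

```python
class AlignmentFault(Exception):
    """Hypothetical stand-in for the alignment exception."""


def store_halfword(memory, address, rd):
    """Store Rd[15:0] at a halfword-aligned address (little-endian assumed)."""
    if address & 1:
        # bit[0] != 0: modelled here as a fault (alignment checking enabled)
        raise AlignmentFault(hex(address))
    data = rd & 0xFFFF                # Rd[15:0]
    memory[address] = data & 0xFF     # low byte
    memory[address + 1] = data >> 8   # high byte


memory = {}
store_halfword(memory, 0x1000, 0x12345678)
```

Only the least-significant halfword of the register value reaches memory; the upper 16 bits are ignored.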
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STR{<cond>}T Rd, <post_indexed_addressing_mode> 


Description The STRT (Store Register with Translation) instruction can be used by 
a (privileged) exception handler that is emulating a memory access instruction that 
would normally execute in User Mode. The access is restricted as if it has 
User Mode privilege.


STRT stores a word from register Rd to the memory address calculated by 
<post_indexed_addressing_mode>. If the instruction is executed when 
the processor is in a privileged mode, the memory system is signalled to treat 
the access as if the processor were in user mode.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]
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Operation if ConditionPassed(<cond>) then 

Memory [<address>,4] = Rd 


Exceptions Data Abort 

Qualifiers Condition Code 

Notes Addressing modes: The I, P and U bits specify the type of <addressing_mode>
(see Addressing Mode 2 starting on page 3-98).


Register Rn: Specifies the base register used by 
<post_indexed_addressing_mode>. 

User mode: If this instruction is executed in user mode, an ordinary user mode 
access is performed. 

Use of R15: If register 15 is specified for Rd, the value stored is IMPLEMENTATION 
DEFINED. 

Operand restrictions: If the same register is specified for Rd and Rn, the results are 
UNPREDICTABLE. 

Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if the same register is specified for Rd and Rn). 

Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with
bits[1:0] != 0b00 will cause an alignment exception. 
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Addressing mode 1 


SUB{<cond>}{S} Rd, Rn, <shifter_operand> 


Description The SUB (Subtract) instruction is used to subtract one value from another
to produce a third. To decrement a register value (in Rx) use:

    SUB Rx, Rx, #1

SUB subtracts the value of <shifter_operand> from the value of register Rn,
and stores the result in the destination register Rd. The condition code flags are
optionally updated (based on the result).

SUBS is useful as a loop counter decrement, as the loop branch can test the flags
for the appropriate termination condition, without the need for a CMP Rx, #0.

Use SUBS PC, LR, #4 to return from an interrupt.

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.
[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              Rd = Rn - <shifter_operand>
              if S == 1 and Rd == R15 then
                  CPSR = SPSR
              else if S == 1 then
                  N Flag = Rd[31]
                  Z Flag = if Rd == 0 then 1 else 0
                  C Flag = NOT BorrowFrom(Rn - <shifter_operand>)
                  V Flag = OverflowFrom(Rn - <shifter_operand>)

Exceptions None

Qualifiers Condition Code
S updates condition code flags N, Z, C and V

Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.


Writing to R15: When Rd is R15 and the S flag in the instruction is not set, 
the result of the operation is placed in the PC. When Rd is R15 and the S flag 
is set, the result of the operation is placed in the PC and the SPSR 
corresponding to the current mode is moved to the CPSR. This allows state 
changes which atomically restore both PC and CPSR. This form of 
the instruction is UNPREDICTABLE in User mode and System mode. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.
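
The flag updates performed by SUBS can be sketched in Python as follows. This is a behavioural model under stated assumptions, not part of the architecture definition: the function name subs is hypothetical, operands are taken to be unsigned 32-bit integers, and the Rd == R15 case (CPSR = SPSR) is not modelled.

```python
MASK32 = 0xFFFFFFFF


def subs(rn, shifter_operand):
    """Return (result, N, Z, C, V) for a 32-bit SUBS."""
    result = (rn - shifter_operand) & MASK32
    n = result >> 31
    z = 1 if result == 0 else 0
    # C Flag = NOT BorrowFrom(...): a borrow occurs when Rn < operand (unsigned)
    c = 0 if rn < shifter_operand else 1
    # Signed overflow: the operands differ in sign and the result's sign differs from Rn's
    v = ((rn ^ shifter_operand) & (rn ^ result)) >> 31
    return result, n, z, c, v
```

Note that the C flag is set when no borrow occurs, so subtracting equal values gives C = 1 and Z = 1.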
SWI{<cond>} <24_bit_immediate>


Description The SWI instruction causes a SWI exception, see 2.5 Exceptions on page 2-6. 


The SWI instruction is used as an operating system service call. It can be used in
two ways:
* to use the 24-bit immediate value to indicate the OS service that is required
* to ignore the 24-bit field and indicate the service required with
  a general-purpose register

A SWI exception is generated, which is handled by an operating system to provide
the requested service.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

31  28 27 26 25 24 23                  0
 cond   1  1  1  1   24_bit_immediate

Operation if ConditionPassed(<cond>) then
              R14_svc = address of SWI instruction + 4
              SPSR_svc = CPSR
              CPSR[5:0] = 0b010011 /* enter Supervisor mode */
              CPSR[7] = 1 /* disable IRQ */
              PC = 0x08
Exceptions None 
Qualifiers Condition Code 
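
The SWI entry sequence, and the extraction of the 24-bit immediate by a handler, can be sketched in Python. The function names and the dict of results are hypothetical; the example word 0xEF000011 is assumed to be a SWI with the always condition and an immediate of 0x11.

```python
def swi_number(instruction):
    """Return the 24-bit immediate field (bits[23:0]) of a SWI instruction word."""
    return instruction & 0x00FFFFFF


def swi_entry(cpsr, swi_address):
    """Model the register updates listed under Operation."""
    new_cpsr = (cpsr & ~0x3F) | 0b010011   # CPSR[5:0] = 0b010011 (Supervisor mode)
    new_cpsr |= 1 << 7                     # CPSR[7] = 1, disable IRQ
    return {
        "R14_svc": swi_address + 4,        # return address
        "SPSR_svc": cpsr,                  # saved status register
        "CPSR": new_cpsr,
        "PC": 0x08,                        # SWI vector
    }


state = swi_entry(0x10, 0x8000)
```

A handler that uses the first of the two conventions would typically load the instruction word from R14_svc - 4 and pass it to something like swi_number to select the service.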
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Not in architecture 
v1 or v2 
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SWP{<cond>} Rd, Rm, [Rn] 


Description The SWP (Swap) instruction swaps a word between registers and memory. 


SWP loads a word from the memory address given by the value of register Rn.
The value of register Rm is then stored to the memory address given by the value
of Rn, and the original loaded value is written to register Rd. If the same register is
specified for Rd and Rm, this instruction swaps the value of the register and
the value at the memory address.


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              <temp> = Memory[Rn,4]
              Memory[Rn,4] = Rm
              Rd = <temp>
Exceptions Data Abort 
Qualifiers Condition Code 
Notes Non-word-aligned addresses: If the address is not word-aligned, the loaded value
is rotated right by 8 times the value of <address>[1:0].

Use of R15: If register 15 is specified for Rd, Rn or Rm, the result is UNPREDICTABLE.


Operand restrictions: If the same register is specified as Rn and Rm, or Rn and 
Rd, the result is UNPREDICTABLE. 


Data Abort: If a data abort is signalled on either the load access or the store access 
(or both), the loaded value is not written to Rd. 


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7), and alignment checking is enabled, an address with 
bits[1:0] != 0b00 will cause an alignment exception. 
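
The three steps of the SWP operation can be sketched in Python. The swp function, the dict-based register file and the word-addressed dict used for memory are assumptions made for illustration; the model assumes a word-aligned address and no data abort.

```python
def swp(registers, memory, rd, rm, rn):
    """Swap a word between memory at [Rn] and the registers."""
    address = registers[rn]
    temp = memory[address]            # <temp> = Memory[Rn,4]
    memory[address] = registers[rm]   # Memory[Rn,4] = Rm
    registers[rd] = temp              # Rd = <temp>


registers = {"R0": 0xAAAA, "R1": 1, "R2": 0x2000}
memory = {0x2000: 0}
swp(registers, memory, "R0", "R1", "R2")
```

After the call, R0 holds the old memory word and the memory location holds the value of R1.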
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SWP{<cond>}B Rd, Rm, [Rn] 


Description The SWPB (Swap Byte) instruction swaps a byte between registers and memory. 


SWPB loads a byte from the memory address given by the value of register Rn.
The value of the least-significant byte of register Rm is stored to the memory
address given by Rn, and the original loaded value is zero-extended to a 32-bit
word, and the word is written to register Rd. If the same register is specified for Rd
and Rm, this instruction swaps the value of the least-significant byte of the register
and the byte value at the memory address.

The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 
[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]
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Operation if ConditionPassed(<cond>) then 
<temp> = Memory[Rn,1] 
Memory[Rn,1] = Rm[7:0] 
Rd = <temp> 
Exceptions Data Abort 
Qualifiers Condition Code 
Notes Use of R15: If register 15 is specified for Rd, Rn or Rm, the result is UNPREDICTABLE. 
Operand restrictions: If the same register is specified as Rn and Rm, or Rn and Rd,
the result is UNPREDICTABLE.
Data Abort: If a data abort is signalled on either the load access or the store 
access (or both), the loaded value is not written to Rd. 
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Addressing mode 1 
TEQ{<cond>} Rn, <shifter_operand>


Description The TEQ (Test Equivalence) instruction is used to test if two values are equal,
without affecting the V flag (as CMP does). TEQ is also useful for testing if two
values have the same sign.


The comparison is the Logical Exclusive OR of the two operands. 


TEQ performs a comparison by logically Exclusive ORing the value of register Rn
with the value of <shifter_operand>, and updates the condition code flags
(based on the result).


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 


[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              <alu_out> = Rn EOR <shifter_operand>
              N Flag = <alu_out>[31]
              Z Flag = if <alu_out> == 0 then 1 else 0
              C Flag = <shifter_carry_out>
              V Flag = unaffected

Exceptions None
Qualifiers Condition Code

Notes Shifter operand: The shifter operands for this instruction are given in Addressing
Mode 1 starting on page 3-84.


The I bit: Bit 25 is used to distinguish between the immediate and register forms of
<shifter_operand>.
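
The two uses of TEQ described above (equality and same-sign tests) can be illustrated with a small Python model. The teq function is hypothetical and models only the N and Z flags for unsigned 32-bit inputs; the C flag depends on the shifter and is left out.

```python
def teq(rn, shifter_operand):
    """Return (N, Z) as TEQ would set them."""
    alu_out = (rn ^ shifter_operand) & 0xFFFFFFFF   # Rn EOR <shifter_operand>
    n = alu_out >> 31
    z = 1 if alu_out == 0 else 0
    return n, z
```

Z == 1 means the two values are equal; N == 0 means bit 31 of the two values matches, that is, they have the same sign.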
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TST{<cond>} Rn, <shifter_operand> 


Description The TST (Test) instruction is used to determine if many bits of a register are all 
clear, or if at least one bit of a register is set. The comparison is a logical AND of 
the two operands. 


TST performs a comparison by logically ANDing the value of register Rn with
the value of <shifter_operand>, and updates the condition code flags (based
on the result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]
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Operation if ConditionPassed(<cond>) then 


              <alu_out> = Rn AND <shifter_operand>
              N Flag = <alu_out>[31]
              Z Flag = if <alu_out> == 0 then 1 else 0
              C Flag = <shifter_carry_out>
              V Flag = unaffected





Exceptions None 
Qualifiers Condition Code 


Notes Shifter operand: The shifter operands for this instruction are given in Addressing 
Mode 1 starting on page 3-84. 


The I bit: Bit 25 is used to distinguish between the immediate and register forms of 
<shifter_operand>. 
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and v4 only 
UMLAL{<cond>}{<S>} RdLo, RdHi, Rm, Rs


Description The UMLAL (Unsigned Multiply Accumulate Long) instruction multiplies unsigned
variables to produce a 64-bit result, which is added to the 64-bit value in the two
destination general-purpose registers. The result is written back to the two
destination general-purpose registers.


UMLAL multiplies the unsigned value of register Rm with the unsigned value of 
register Rs to produce a 64-bit result. The lower 32 bits of the result are added 
to RdLo and stored in RdLo; the upper 32 bits and the carry from the addition 
to RdLo are added to RdHi and stored in RdHi. The condition code flags are 
optionally updated (based on the 64-bit result). 


The instruction is only executed if the condition specified in the instruction matches 
the condition code status. The conditions are defined in 3.3 The Condition Field on 
page 3-4. 


[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]

Operation if ConditionPassed(<cond>) then
              RdLo = (Rm * Rs)[31:0] + RdLo
              RdHi = (Rm * Rs)[63:32] + RdHi + CarryFrom((Rm * Rs)[31:0] + RdLo)
              if S == 1 then
                  N Flag = RdHi[31]
                  Z Flag = if (RdHi == 0) and (RdLo == 0) then 1 else 0
                  C Flag = UNPREDICTABLE
                  V Flag = UNPREDICTABLE
Exceptions None 
Qualifiers Condition Code 
S updates condition code flags N and Z 
Notes Use of R15: Specifying R15 for register RdHi, RdLo, Rm or Rs has UNPREDICTABLE
results.
Operand restriction: Specifying the same register for RdHi and Rm has 
UNPREDICTABLE results. 
Specifying the same register for RdLo and Rm has UNPREDICTABLE results. 
Specifying the same register for RdHi and RdLo has UNPREDICTABLE results. 
Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 
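
The carry propagation from RdLo into RdHi can be sketched in Python. The umlal function is a hypothetical model that takes the old register values as unsigned 32-bit integers and returns the new values together with the N and Z flags; the C and V flags are UNPREDICTABLE and are not modelled.

```python
MASK32 = 0xFFFFFFFF


def umlal(rd_lo, rd_hi, rm, rs):
    """Return (new RdLo, new RdHi, N, Z) for UMLALS."""
    product = rm * rs                          # full 64-bit unsigned product
    low_sum = (product & MASK32) + rd_lo       # may exceed 32 bits
    carry = low_sum >> 32                      # CarryFrom(low half addition)
    new_lo = low_sum & MASK32
    new_hi = ((product >> 32) + rd_hi + carry) & MASK32
    n = new_hi >> 31
    z = 1 if new_hi == 0 and new_lo == 0 else 0
    return new_lo, new_hi, n, z
```

The result is the same as adding the product to the 64-bit value formed by RdHi:RdLo and keeping the low 64 bits.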
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UMULL{<cond>}{<S>} RdLo, RdHi, Rm, Rs 


Description The UMULL (Unsigned Multiply Long) instruction multiplies unsigned variables to 
produce a 64-bit result in two general-purpose registers. 





UMULL multiplies the unsigned value of register Rm with the unsigned value of
register Rs to produce a 64-bit result. The upper 32 bits of the result are stored in
RdHi; the lower 32 bits are stored in RdLo. The condition code flags are optionally
updated (based on the 64-bit result).

The instruction is only executed if the condition specified in the instruction matches
the condition code status. See 3.3 The Condition Field on page 3-4.

[Encoding diagram; bit positions 31 28 27 26 25 24 23 22 21 20 19 16 15 12 11 0]
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Operation 
if ConditionPassed(<cond>) then 
RdHi = (Rm * Rs) [63:32] 
RdLo = (Rm * Rs) [31:0] 
if S == 1 then 
N Flag = RdHi[31] 
Z Flag = if (RdHi == 0) and (RdLo == 0) then 1 else 0 
C Flag = UNPREDICTABLE 
V Flag = UNPREDICTABLE 
Exceptions None 
Qualifiers Condition Code 
S updates condition code flags N and Z 
Notes Use of R15: Specifying R15 for register RdHi, RdLo, Rm or Rs has UNPREDICTABLE 
results. 
Operand restriction: Specifying the same register for RdHi and Rm has
UNPREDICTABLE results.
Specifying the same register for RdLo and Rm has UNPREDICTABLE results. 
Specifying the same register for RdHi and RdLo has UNPREDICTABLE results. 
Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rs operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 
Addressing mode

Mode 1    Shifter operands for data-processing instructions
Mode 2    Load and store word or unsigned byte
Mode 3    Load and store halfword or load signed byte
Mode 4    Load and store multiple
Mode 5    Load and store coprocessor


Function                        short description of the addressing mode
Architecture availability       not all addressing modes are available in all
                                versions of the ARM architecture
Encoding                        specifies the bit patterns for the addressing mode
Operation                       describes the operation of
                                the addressing mode in pseudo-code
Qualifiers and flag settings    lists any conditions and flag settings
                                that apply to the addressing mode
User notes                      gives notes on using the addressing mode
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ARM Addressing Modes 





















































[Figure: annotated sample addressing-mode page, showing its Description, Syntax, Operation, Qualifiers and Notes parts]

3.16 Data-processing Operands

General encoding

<opcode>{<cond>}{S} {Rd}, {Rn}, <shifter_operand>

32-bit immediate

31  28 27 26 25 24    21 20 19  16 15  12 11         8 7               0
 cond   0  0  1  opcode   S    Rn     Rd    rotate_imm  8_bit_immediate

Immediate shifts

31  28 27 26 25 24    21 20 19  16 15  12 11        7 6    5 4 3     0
 cond   0  0  0  opcode   S    Rn     Rd    shift_imm  shift  0   Rm

Register shifts

31  28 27 26 25 24    21 20 19  16 15  12 11   8 7 6    5 4 3     0
 cond   0  0  0  opcode   S    Rn     Rd     Rs   0 shift  1   Rm

Description
<opcode>      Describes the operation of the instruction.
S bit         Indicates that the instruction updates the condition codes.
Rd            Specifies the destination register.
Rn            Specifies the first source operand register.
Bits[11:0]    The fields within bits[11:0] are collectively called
              a <shifter_operand>. This is described below.
Bit 25        Is referred to as the I bit, and is used to distinguish between
              an immediate <shifter_operand> and a register-based
              <shifter_operand>.

3.16.1 The shifter operand

As well as producing <shifter_operand>, the shifter produces a carry-out
which some instructions write into the Carry Flag.

The shifter operand takes one of 3 basic formats:
* Immediate operand value
* Register operand value
* Shifted register operand value

Format 1: Immediate operand value

An immediate operand value is formed by rotating an 8-bit constant (in a 32-bit
word) by an even number of bits (0, 2, 4, 6 ... 26, 28, 30). Thus, each instruction
contains an 8-bit constant and a 4-bit rotate to be applied to that constant.

Valid constants are:

0xff, 0x104, 0xff0, 0xff00, 0xff000, 0xff000000, 0xf000000f

Invalid constants are:

0x101, 0x102, 0xff1, 0xff04, 0xff003, 0xffffffff, 0xf000001f
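
Whether a constant is valid can be checked mechanically by trying all sixteen rotations. The Python sketch below does this; the names rotate_right and encode_immediate are hypothetical and are not part of any assembler.

```python
def rotate_right(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def encode_immediate(constant):
    """Return (rotate_imm, 8_bit_immediate), or None if the constant is invalid."""
    for rotate_imm in range(16):
        # Undo a right rotation by 2 * rotate_imm by rotating a further 32 - 2 * rotate_imm
        candidate = rotate_right(constant, 32 - 2 * rotate_imm)
        if candidate <= 0xFF:
            return rotate_imm, candidate
    return None
```

For example, 0x104 is 0x41 rotated right by 30 bits, so it is returned as rotate_imm 15 with an 8-bit constant of 0x41, while 0x102 would need an odd rotation and is rejected.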





For example:

MOV R0, #0 ; Move zero to R0
ADD R3, R3, #1 ; Add one to the value of register 3
CMP R7, #1000 ; Compare value of R7 with 1000
BIC R9, R8, #0xff00 ; Clear bits 8-15 of R8 and store in R9


Format 2: Register operand value 


A register operand value is simply the value of a register. The value of the register 
is used directly as the operand to the data-processing instruction. 


For example: 
MOV R2, R0 ; Move the value of R0 to R2
ADD R4, R3, R2 ; Add R2 to R3, store result in R4
CMP R7, R8 ; Compare the value of R7 and R8


Format 3: Shifted register operand value 


A shifted register operand value is the value of a register, shifted (or rotated) 
before it is used as the data-processing operand. There are five types of shift: 


ASR Arithmetic shift right 
LSL Logical shift left 

LSR Logical shift right 

ROR Rotate right 

RRX Rotate right with extend 


The number of bits to shift by is specified either as an immediate or as the value 
of the register. 


For example:

MOV R2, R0, LSL #2 ; Shift R0 left by 2, store in R2 (R2 = R0 x 4)
ADD R9, R5, R5, LSL #3 ; R9 = R5 + R5 x 8 or R9 = R5 x 9
RSB R9, R5, R5, LSL #3 ; R9 = R5 x 8 - R5 or R9 = R5 x 7
SUB R10, R9, R8, LSR #4 ; R10 = R9 - R8 / 16
MOV R12, R4, ROR R3 ; R12 = R4 rotated right by value of R3


The default shifter operand 


The default register operand (register Rm specified with no shift) uses the form 
register shift left by immediate, with the immediate set to zero. 
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Shifter operands 


3.16.2 Shifter Operands 


The 11 types of <shifter_operand> are described on the following pages:

Immediate
Register
Logical shift left by immediate
Logical shift left by register
Logical shift right by immediate
Logical shift right by register
Arithmetic shift right by immediate
Arithmetic shift right by register
Rotate right by immediate
Rotate right by register
Rotate right with extend
Immediate

Description The <shifter_operand> value is formed by rotating (to the right) an 8-bit
immediate value to any even bit position in a 32-bit word. If the rotate immediate
is zero, the carry-out from the shifter is the value of the C flag, otherwise, it is set
to bit 31 of the value of <shifter_operand>.


This data-processing operand provides a constant (defined in the instruction) 
operand to a data-processing instruction. 








[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation <shifter_operand> = <8_bit_immediate> Rotate_Right (<rotate_imm> * 2)
          if <rotate_imm> == 0 then
              <shifter_carry_out> = C flag
          else /* <rotate_imm> > 0 */
              <shifter_carry_out> = <shifter_operand>[31]
Notes Legitimate immediates: Not all 32-bit immediates are legitimate; only those that
can be formed by rotating an 8-bit immediate right by an even amount are
valid 32-bit immediates for this format.

Alternative assembly specification: The 32-bit immediate can also be specified by:

#<8_bit_immediate>, <rotate_amount>

where:

<rotate_amount> = <rotate_imm> << 1
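
The immediate operand and its carry-out can be modelled in Python as below. The function name immediate_operand and its argument names are assumptions made for this sketch; the C flag is passed in as 0 or 1.

```python
def immediate_operand(immed_8, rotate_imm, c_flag):
    """Return (<shifter_operand>, <shifter_carry_out>) for the immediate form."""
    amount = rotate_imm * 2                     # always an even rotation
    operand = ((immed_8 >> amount) | (immed_8 << (32 - amount))) & 0xFFFFFFFF
    if rotate_imm == 0:
        carry_out = c_flag                      # no rotation: C flag passes through
    else:
        carry_out = operand >> 31               # bit 31 of the rotated value
    return operand, carry_out
```

With rotate_imm equal to 4 the 8-bit constant 0xff is rotated right by 8 bits to give 0xff000000, and the carry-out is 1.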
Register

Description This data-processing operand provides the value of a register directly.


This is an instruction operand produced by the value of register Rm. The carry-out 
from the shifter is the C flag. 


[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation <shifter_operand> = Rm
          <shifter_carry_out> = C Flag
Notes Encoding: This instruction is encoded as a Logical shift left by immediate 
(see page 3-89) with a shift of zero (<shift_imm> == 0). 


Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 
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Rm, LSL #<shift_imm> 


Description This data-processing operand is used to provide either the value of a register
directly (lone register operand, see page 3-88), or the value of a register shifted
left (multiplied by a constant power of two).

This instruction operand is produced by the value of register Rm, logically shifted
left by an immediate value in the range 0 to 31. Zeros are inserted into the vacated
bit positions. The carry-out from the shifter is the last bit shifted out, or the C flag if
no shift is specified (lone register operand, see page 3-88).

[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]


Operation if <shift_imm> == 0 then /* Register Operand */ 
<shifter_operand> = Rm 
<shifter_carry_out> = C Flag 
else /* <shift_imm> > 0 */ 
<shifter_operand> = Rm Logical_Shift_Left <shift_imm> 
<shifter_carry_out> = Rm[32 - <shift_imm>] 





Notes Default shift: If the value of <shift_imm> == 0, the operand may be written as 
just Rm, (see page 3-88). 


Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 


ARM Architecture Reference Manual 
ARM DUI 0100B 


Rm, LSL Rs


Description This data-processing operand is used to provide the value of a register multiplied
by a variable (in a register) power of two.

It is produced by the value of register Rm, logically shifted left by the value in
the least-significant byte of register Rs. Zeros are inserted into the vacated bit
positions.

[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation if Rs[7:0] == 0 then
              <shifter_operand> = Rm
              <shifter_carry_out> = C Flag
          else if Rs[7:0] < 32 then
              <shifter_operand> = Rm Logical_Shift_Left Rs[7:0]
              <shifter_carry_out> = Rm[32 - Rs[7:0]]
          else if Rs[7:0] == 32 then
              <shifter_operand> = 0
              <shifter_carry_out> = Rm[0]
          else /* Rs[7:0] > 32 */
              <shifter_operand> = 0
              <shifter_carry_out> = 0

Notes Use of R15: Specifying R15 as register Rm, register Rn or register Rs has 
UNPREDICTABLE results. 
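
The four cases of the logical shift left by register can be sketched in Python. The function lsl_register is a hypothetical model: rm and rs are unsigned 32-bit register values and c_flag is the current C flag (0 or 1).

```python
def lsl_register(rm, rs, c_flag):
    """Return (<shifter_operand>, <shifter_carry_out>) for Rm, LSL Rs."""
    shift = rs & 0xFF                       # only Rs[7:0] is used
    if shift == 0:
        return rm, c_flag
    if shift < 32:
        operand = (rm << shift) & 0xFFFFFFFF
        carry_out = (rm >> (32 - shift)) & 1    # last bit shifted out
        return operand, carry_out
    if shift == 32:
        return 0, rm & 1                    # carry-out is Rm[0]
    return 0, 0                             # shift > 32
```

Because only the least-significant byte of Rs is used, a value such as 0x100 in Rs behaves as a shift of zero.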
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Rm, LSR #<shift_imm> 


Description This data-processing operand is used to provide the unsigned value of a register 
shifted right (divided by a constant power of two). 


It is produced by the value of register Rm logically shifted right by an immediate 
value in the range 1 to 32. Zeros are inserted into the vacated bit positions. A shift 
by 32 is encoded by <shift_imm> = 0. 





[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation if <shift_imm> == 0 then
              <shifter_operand> = 0
              <shifter_carry_out> = Rm[31]
          else /* <shift_imm> > 0 */
              <shifter_operand> = Rm Logical_Shift_Right <shift_imm>
              <shifter_carry_out> = Rm[<shift_imm> - 1]


Notes Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 
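
The special encoding of a shift by 32 can be seen in the following Python sketch. The function lsr_immediate is hypothetical; shift_imm is the 5-bit field from the instruction (0 to 31) and rm is an unsigned 32-bit value.

```python
def lsr_immediate(rm, shift_imm):
    """Return (<shifter_operand>, <shifter_carry_out>) for Rm, LSR #<shift_imm>."""
    if shift_imm == 0:
        # <shift_imm> == 0 encodes LSR #32: result is zero, carry-out is Rm[31]
        return 0, rm >> 31
    return rm >> shift_imm, (rm >> (shift_imm - 1)) & 1
```

The carry-out is always the last bit shifted out of the right-hand end.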
Rm, LSR Rs


Description This data-processing operand is used to provide the unsigned value of a register
shifted right (divided by a variable power of two (in a register)).

It is produced by the value of register Rm logically shifted right by the value in
the least-significant byte of register Rs. Zeros are inserted into the vacated bit
positions.


[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation if Rs[7:0] == 0 then
              <shifter_operand> = Rm
              <shifter_carry_out> = C Flag
          else if Rs[7:0] < 32 then
              <shifter_operand> = Rm Logical_Shift_Right Rs[7:0]
              <shifter_carry_out> = Rm[Rs[7:0] - 1]
          else if Rs[7:0] == 32 then
              <shifter_operand> = 0
              <shifter_carry_out> = Rm[31]
          else /* Rs[7:0] > 32 */
              <shifter_operand> = 0
              <shifter_carry_out> = 0

Notes Use of R15: Specifying R15 as register Rm, register Rn or register Rs has 
UNPREDICTABLE results. 
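
A Python model of the logical shift right by register follows the same four-way split. The name lsr_register is an assumption; inputs are unsigned 32-bit values and c_flag is 0 or 1.

```python
def lsr_register(rm, rs, c_flag):
    """Return (<shifter_operand>, <shifter_carry_out>) for Rm, LSR Rs."""
    shift = rs & 0xFF                       # only Rs[7:0] is used
    if shift == 0:
        return rm, c_flag
    if shift < 32:
        return rm >> shift, (rm >> (shift - 1)) & 1
    if shift == 32:
        return 0, rm >> 31                  # carry-out is Rm[31]
    return 0, 0                             # shift > 32
```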
Rm, ASR #<shift_imm>


Description This data-processing operand is used to provide the signed value of a register
arithmetically shifted right (divided by a constant power of two).

It is produced by the value of register Rm arithmetically shifted right by
an immediate value in the range 1 to 32. The sign bit of Rm (Rm[31]) is inserted
into the vacated bit positions. A shift by 32 is encoded by <shift_imm> = 0.


[Encoding diagram; bit positions 31 28 27 26 25 24 21 20 19 16 15 12 11 0]

Operation if <shift_imm> == 0 then
              if Rm[31] == 0 then
                  <shifter_operand> = 0
                  <shifter_carry_out> = Rm[31]
              else /* Rm[31] == 1 */
                  <shifter_operand> = 0xffffffff
                  <shifter_carry_out> = Rm[31]
          else /* <shift_imm> > 0 */
              <shifter_operand> = Rm Arithmetic_Shift_Right <shift_imm>
              <shifter_carry_out> = Rm[<shift_imm> - 1]


Notes Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 
Arithmetic shift right by register

Rm, ASR Rs

Description This data-processing operand is used to provide the signed value of a register
arithmetically shifted right (divided by a variable power of two (in a register)).

It is produced by the value of register Rm arithmetically shifted right by the value
in the least-significant byte of register Rs. The sign bit of Rm (Rm[31]) is inserted
into the vacated bit positions.


Operation if Rs[7:0] == 0 then
              <shifter_operand> = Rm
              <shifter_carry_out> = C Flag
          else if Rs[7:0] < 32 then
              <shifter_operand> = Rm Arithmetic_Shift_Right Rs[7:0]
              <shifter_carry_out> = Rm[Rs[7:0] - 1]
          else /* Rs[7:0] >= 32 */
              if Rm[31] == 0 then
                  <shifter_operand> = 0
                  <shifter_carry_out> = Rm[31]
              else /* Rm[31] == 1 */
                  <shifter_operand> = 0xFFFFFFFF
                  <shifter_carry_out> = Rm[31]

Notes Use of R15: Specifying R15 as register Rm, register Rn or register Rs has
UNPREDICTABLE results.
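
A possible C model of the Rm, ASR Rs operation is sketched below. The names asr_result and asr_by_register are hypothetical. Because right-shifting a negative signed integer is implementation-defined in C, this sketch shifts the value as unsigned and fills the vacated bits with the sign bit by hand; that is a choice of the sketch, not something the manual prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the shifter operand and the shifter carry-out. */
typedef struct {
    uint32_t operand;
    unsigned carry_out;
} asr_result;

/* Model of "Rm, ASR Rs". Only Rs[7:0] takes part in the shift. */
static asr_result asr_by_register(uint32_t rm, uint32_t rs, unsigned c_flag)
{
    assert(c_flag <= 1);
    uint32_t amount = rs & 0xFFu;              /* Rs[7:0] */
    uint32_t sign = rm >> 31;                  /* Rm[31] */
    asr_result r;

    if (amount == 0) {
        r.operand = rm;
        r.carry_out = c_flag;
    } else if (amount < 32) {
        /* Mask covering the top 'amount' bits, set only for negative values. */
        uint32_t fill = sign ? (uint32_t)~(UINT32_C(0xFFFFFFFF) >> amount) : 0u;
        r.operand = (rm >> amount) | fill;
        r.carry_out = (rm >> (amount - 1)) & 1u;
    } else {
        /* Shift of 32 or more: every bit becomes a copy of the sign bit. */
        r.operand = sign ? UINT32_C(0xFFFFFFFF) : 0u;
        r.carry_out = sign;
    }
    return r;
}
```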
Rotate right by immediate

Rm, ROR #<shift_imm>

Description This data-processing operand is used to provide the value of a register rotated by
a constant value.

It is produced by the value of register Rm rotated right by
an immediate value in the range 1 to 31. As bits are rotated off the right end, they
are inserted into the vacated bit positions on the left.

When <shift_imm> = 0, a Rotate right with extend operation is performed;
see page 3-97.


Operation if <shift_imm> == 0 then
              See Rm, RRX on page 3-97
          else /* <shift_imm> > 0 */
              <shifter_operand> = Rm Rotate_Right <shift_imm>
              <shifter_carry_out> = Rm[<shift_imm> - 1]


Notes Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 


ARM Architecture Reference Manual 
ARM DUI 0100B 


Rotate right by register

Rm, ROR Rs

Description This data-processing operand is used to provide the value of a register rotated by
a variable value (in a register).

It is produced by the value of register Rm rotated right by the value in the 
least-significant byte of register Rs. As bits are rotated off the right end, they are 
inserted into the vacated bit positions on the left. 


Operation if Rs[7:0] == 0 then
              <shifter_operand> = Rm
              <shifter_carry_out> = C Flag
          else if Rs[4:0] == 0 then
              <shifter_operand> = Rm
              <shifter_carry_out> = Rm[31]
          else /* Rs[4:0] > 0 */
              <shifter_operand> = Rm Rotate_Right Rs[4:0]
              <shifter_carry_out> = Rm[Rs[4:0] - 1]





Notes Use of R15: Specifying R15 as register Rm, register Rn or register Rs has 
UNPREDICTABLE results. 
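
The distinction between Rs[7:0] and Rs[4:0] in the Rm, ROR Rs operation is easy to miss, so a small C sketch may help. The names ror_result and ror_by_register are hypothetical. A rotation by a non-zero multiple of 32 leaves the value unchanged but still updates the carry-out from Rm[31], whereas a zero byte leaves the carry flag alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the shifter operand and the shifter carry-out. */
typedef struct {
    uint32_t operand;
    unsigned carry_out;
} ror_result;

/* Model of "Rm, ROR Rs". */
static ror_result ror_by_register(uint32_t rm, uint32_t rs, unsigned c_flag)
{
    assert(c_flag <= 1);
    uint32_t byte = rs & 0xFFu;                /* Rs[7:0] */
    uint32_t amount = rs & 0x1Fu;              /* Rs[4:0] */
    ror_result r;

    if (byte == 0) {
        r.operand = rm;
        r.carry_out = c_flag;
    } else if (amount == 0) {
        /* Rotation by 32, 64, ...: value unchanged, carry taken from Rm[31]. */
        r.operand = rm;
        r.carry_out = rm >> 31;
    } else {
        /* amount is 1..31 here, so neither shift count reaches 32. */
        r.operand = (rm >> amount) | (rm << (32 - amount));
        r.carry_out = (rm >> (amount - 1)) & 1u;
    }
    return r;
}
```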
Rotate right with extend

Rm, RRX

Description This data-processing operand can be used to perform a 33-bit rotate right using
the Carry Flag as the 33rd bit.

It is produced by the value of register Rm shifted right by one bit, with
the Carry Flag replacing the vacated bit position.





Operation <shifter_operand> = (C Flag Logical_Shift_Left 31) OR
                              (Rm Logical_Shift_Right 1)
          <shifter_carry_out> = Rm[0]


Notes Encoding: The instruction encoding is in the space that would be used for 
ROR #0. 


Use of R15: If R15 is specified as register Rm or Rn, the value used is the address 
of the current instruction plus 8. 


ADC instruction: A rotate left with extend can be performed with an ADC
instruction (by adding a register to itself with carry).
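
As an illustration of the 33-bit rotation, the RRX operation can be modelled in C as below. The names rrx_result and rrx are hypothetical, and the carry flag is assumed to be passed as 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the shifter operand and the shifter carry-out. */
typedef struct {
    uint32_t operand;
    unsigned carry_out;
} rrx_result;

/* Model of "Rm, RRX": rotate the 33-bit quantity {C, Rm} right by one bit. */
static rrx_result rrx(uint32_t rm, unsigned c_flag)
{
    assert(c_flag <= 1);
    rrx_result r;
    r.operand = ((uint32_t)c_flag << 31) | (rm >> 1); /* C enters at bit 31 */
    r.carry_out = rm & 1u;                            /* Rm[0] leaves into C */
    return r;
}
```

Because the carry flag acts as the 33rd bit, feeding each carry-out back in as the next carry-in and applying rrx 33 times restores the original register value and carry.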
3.17 Load and Store Word or Unsigned Byte Addressing Modes

There are nine addressing modes used to calculate the address for a load and
store word or unsigned byte instruction. Each addressing mode is described in
detail on the following pages.


1 Immediate offset page 3-100
LDR|STR{<cond>}{B} Rd, [Rn, #+/-<12_bit_offset>]

2 Register offset page 3-101
LDR|STR{<cond>}{B} Rd, [Rn, +/-Rm]

3 Scaled register offsets page 3-102
LDR|STR{<cond>}{B} Rd, [Rn, +/-Rm, <shift> #<shift_imm>]

4 Immediate pre-indexed page 3-103
LDR|STR{<cond>}{B} Rd, [Rn, #+/-<12_bit_offset>]!

5 Register pre-indexed page 3-104
LDR|STR{<cond>}{B} Rd, [Rn, +/-Rm]!

6 Scaled register pre-indexed page 3-105
LDR|STR{<cond>}{B} Rd, [Rn, +/-Rm, <shift> #<shift_imm>]!

7 Immediate post-indexed page 3-106
LDR|STR{<cond>}{B}{T} Rd, [Rn], #+/-<12_bit_offset>

8 Register post-indexed page 3-107
LDR|STR{<cond>}{B}{T} Rd, [Rn], +/-Rm

9 Scaled register post-indexed page 3-108
LDR|STR{<cond>}{B}{T} Rd, [Rn], +/-Rm, <shift> #<shift_imm>





General encoding

Immediate offset/index
31-28  27-25  24  23  22  21  20  19-16  15-12  11-0
cond   0 1 0  P   U   B   W   L   Rn     Rd     12_bit_offset

Register offset/index
31-28  27-25  24  23  22  21  20  19-16  15-12  11-4             3-0
cond   0 1 1  P   U   B   W   L   Rn     Rd     0 0 0 0 0 0 0 0  Rm

Scaled register offset/index
31-28  27-25  24  23  22  21  20  19-16  15-12  11-7       6-5    4  3-0
cond   0 1 1  P   U   B   W   L   Rn     Rd     shift_imm  shift  0  Rm
Notes The P bit: Pre/Post indexing:

Pre-indexing (P==1) indicates the offset is applied to the base register, and
the result is used as the address.

Post-indexing (P==0) indicates the base register value is used for the
address; the offset is then applied to the base register and written back to the
base register.

The U bit: Indicates whether the offset is added to the base (U == 1) or subtracted
from the base (U == 0).

The B bit: This bit distinguishes between an unsigned byte (B == 1) and a word
(B == 0) access.

The W bit: This bit has two meanings:
if P == 1   if W == 1, the calculated address will be written back to the
            base register. (If W == 0, the base register will not be
            updated.)
if P == 0   if W == 1, the current access is treated (by the protection
            and memory system) as a user mode access. (If W == 0,
            a normal access is performed.)

The L bit: This bit distinguishes between a Load (L == 1) and a Store (L == 0).
Immediate offset

[Rn, #+/-<12_bit_offset>]

Description This addressing mode is useful for accessing structure (record) fields, and
accessing parameters and local variables in a stack frame. With an offset of zero,
the address produced is the unaltered value of the base register Rn.


It calculates an address by adding or subtracting the value of an immediate offset 
to or from the value of the base register Rn. 


Operation if U == 1 then
              <address> = Rn + <12_bit_offset>
          else /* U == 0 */
              <address> = Rn - <12_bit_offset>


Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 
The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 
Use of R15: If R15 is specified as register Rn, the value used is the address of 
the instruction plus 8. 


Register offset

[Rn, +/-Rm]

Description This addressing mode is used for pointer + offset arithmetic, and accessing
a single element of an array.





It calculates an address by adding or subtracting the value of the index register Rm 
to or from the value of the base register Rn. 





Operation if U == 1 then
              <address> = Rn + Rm
          else /* U == 0 */
              <address> = Rn - Rm


Notes Encoding: This addressing mode is encoded as an LSL scaled register offset, 
scaled by zero. 


The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 

The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 

Use of R15: If R15 is specified as register Rn, the value used is the address of
the instruction plus 8. Specifying R15 as register Rm has UNPREDICTABLE
results.
Scaled register offsets

[Rn, +/-Rm, LSL #<shift_imm>]
[Rn, +/-Rm, LSR #<shift_imm>]
[Rn, +/-Rm, ASR #<shift_imm>]
[Rn, +/-Rm, ROR #<shift_imm>]
[Rn, +/-Rm, RRX]

Description These addressing modes are used for accessing a single element of an array of
values larger than a byte.


They calculate an address by adding or subtracting the shifted or rotated value of 
the index register Rm to or from the value of the base register Rn. 


Operation case <shift> of
              00 /* LSL */
                  <index> = Rm Logical_Shift_Left <shift_imm>
              01 /* LSR */
                  <index> = Rm Logical_Shift_Right <shift_imm>
              10 /* ASR */
                  <index> = Rm Arithmetic_Shift_Right <shift_imm>
              11 /* ROR or RRX */
                  if <shift_imm> == 0 then /* RRX */
                      <index> = (C Flag Logical_Shift_Left 31)
                                OR (Rm Logical_Shift_Right 1)
                  else /* ROR */
                      <index> = Rm Rotate_Right <shift_imm>
          endcase
          if U == 1 then
              <address> = Rn + <index>
          else /* U == 0 */
              <address> = Rn - <index>


Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 
The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 
Use of R15: If R15 is specified as register Rn, the value used is the address of 
the instruction plus 8. Specifying R15 as register Rm has UNPREDICTABLE 
results. 
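
The scaled register offset calculation can be sketched in C as below, for instance to see how an array of words is indexed with LSL #2. All names here (shift_kind, scaled_index, scaled_register_offset) are hypothetical. To keep the sketch within defined C behaviour, it takes the shift amount that the instruction requests rather than the raw <shift_imm> field, and it assumes that amount is 0 to 31 for LSL and 1 to 31 for LSR, ASR and ROR; RRX is treated as its own case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the shift applied to the index register. */
typedef enum { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR, SHIFT_ROR, SHIFT_RRX } shift_kind;

/* Compute <index> from Rm. 'amount' is ignored for RRX. */
static uint32_t scaled_index(uint32_t rm, shift_kind kind, unsigned amount,
                             unsigned c_flag)
{
    switch (kind) {
    case SHIFT_LSL:
        assert(amount <= 31);
        return rm << amount;
    case SHIFT_LSR:
        assert(amount >= 1 && amount <= 31);
        return rm >> amount;
    case SHIFT_ASR: {
        assert(amount >= 1 && amount <= 31);
        /* Fill the vacated top bits with copies of Rm[31]. */
        uint32_t fill = (rm >> 31) ? (uint32_t)~(UINT32_C(0xFFFFFFFF) >> amount) : 0u;
        return (rm >> amount) | fill;
    }
    case SHIFT_ROR:
        assert(amount >= 1 && amount <= 31);
        return (rm >> amount) | (rm << (32 - amount));
    case SHIFT_RRX:
        assert(c_flag <= 1);
        return ((uint32_t)c_flag << 31) | (rm >> 1);
    }
    return 0; /* not reached for a valid shift_kind */
}

/* <address> = Rn + <index> when U == 1, Rn - <index> when U == 0. */
static uint32_t scaled_register_offset(uint32_t rn, uint32_t rm, shift_kind kind,
                                       unsigned amount, unsigned c_flag, int u_bit)
{
    uint32_t index = scaled_index(rm, kind, amount, c_flag);
    return u_bit ? rn + index : rn - index;
}
```

With a base of 0x1000 in Rn and an element number of 3 in Rm, an LSL by 2 with U == 1 gives the address 0x100C, the fourth word of the array.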
Immediate pre-indexed

[Rn, #+/-<12_bit_offset>]!

Description This addressing mode is used for pointer access to arrays with automatic update
of the pointer value.

It calculates an address by adding or subtracting the value of an immediate offset
to or from the value of the base register Rn.


If the condition specified in the instruction matches the condition code status, 
the calculated address is written back to the base register Rn. The conditions are 
defined in 3.3 The Condition Field on page 3-4. 





Operation if U == 1 then
              <address> = Rn + <12_bit_offset>
          else /* U == 0 */
              <address> = Rn - <12_bit_offset>
          if ConditionPassed(<cond>) then
              Rn = <address>


Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 
The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 
Use of R15: Specifying R15 as register Rn has UNPREDICTABLE results. 
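
The difference between pre-indexing here and the post-indexed modes described later in this section lies in which value is used as the address; the base register ends up with the same value in both cases. A hedged C sketch follows, in which indexed_result, immediate_pre_indexed and immediate_post_indexed are hypothetical names and condition_passed stands in for ConditionPassed(<cond>).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the access address and the base register afterwards. */
typedef struct {
    uint32_t address;
    uint32_t rn_after;
} indexed_result;

/* [Rn, #+/-<12_bit_offset>]! : the offset is applied first, then written back. */
static indexed_result immediate_pre_indexed(uint32_t rn, uint32_t offset12,
                                            int u_bit, int condition_passed)
{
    assert(offset12 <= 0xFFFu);
    indexed_result r;
    r.address = u_bit ? rn + offset12 : rn - offset12;
    r.rn_after = condition_passed ? r.address : rn;
    return r;
}

/* [Rn], #+/-<12_bit_offset> : the unmodified base is the address. */
static indexed_result immediate_post_indexed(uint32_t rn, uint32_t offset12,
                                             int u_bit, int condition_passed)
{
    assert(offset12 <= 0xFFFu);
    indexed_result r;
    r.address = rn;
    r.rn_after = rn;
    if (condition_passed)
        r.rn_after = u_bit ? rn + offset12 : rn - offset12;
    return r;
}
```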
Register pre-indexed

[Rn, +/-Rm]!

Description This addressing mode calculates an address by adding or subtracting the value of
an index register Rm to or from the value of the base register Rn.


If the condition specified in the instruction matches the condition code status, 
the calculated address is written back to the base register Rn. The conditions are 
defined in 3.3 The Condition Field on page 3-4. 


Operation if U == 1 then
              <address> = Rn + Rm
          else /* U == 0 */
              <address> = Rn - Rm
          if ConditionPassed(<cond>) then
              Rn = <address>

Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word
(B==0) access. 

The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 

Use of R15: Specifying R15 as register Rm or Rn has UNPREDICTABLE results. 

Operand Restrictions: If the same register is specified for Rn and Rm, the result is 
UNPREDICTABLE. 
Scaled register pre-indexed

[Rn, +/-Rm, LSL #<shift_imm>]!
[Rn, +/-Rm, LSR #<shift_imm>]!
[Rn, +/-Rm, ASR #<shift_imm>]!
[Rn, +/-Rm, ROR #<shift_imm>]!
[Rn, +/-Rm, RRX]!

Description These five addressing modes calculate an address by adding or subtracting
the shifted or rotated value of the index register Rm to or from the value of the base
register Rn.


If the condition specified in the instruction matches the condition code status, 
the calculated address is written back to the base register Rn. The conditions are 
defined in 3.3 The Condition Field on page 3-4. 
Operation case <shift> of
              00 /* LSL */
                  <index> = Rm Logical_Shift_Left <shift_imm>
              01 /* LSR */
                  <index> = Rm Logical_Shift_Right <shift_imm>
              10 /* ASR */
                  <index> = Rm Arithmetic_Shift_Right <shift_imm>
              11 /* ROR or RRX */
                  if <shift_imm> == 0 then /* RRX */
                      <index> = (C Flag Logical_Shift_Left 31)
                                OR (Rm Logical_Shift_Right 1)
                  else /* ROR */
                      <index> = Rm Rotate_Right <shift_imm>
          endcase
          if U == 1 then
              <address> = Rn + <index>
          else /* U == 0 */
              <address> = Rn - <index>
          if ConditionPassed(<cond>) then
              Rn = <address>

Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word
(B==0) access.


The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


Use of R15: Specifying R15 as register Rm or Rn has UNPREDICTABLE results. 


Operand Restrictions: If the same register is specified for Rn and Rm, the result is 
UNPREDICTABLE. 
Immediate post-indexed

[Rn], #+/-<12_bit_offset>

Description This addressing mode is used for pointer access to arrays with automatic update
of the pointer value.

It calculates an address from the value of base register Rn.

If the condition specified in the instruction matches the condition code status,
the value of the immediate offset is added to or subtracted from the value of the base
register Rn and written back to the base register Rn. The conditions are defined in
3.3 The Condition Field on page 3-4.


Operation <address> = Rn
          if ConditionPassed(<cond>) then
              if U == 1 then
                  Rn = Rn + <12_bit_offset>
              else /* U == 0 */
                  Rn = Rn - <12_bit_offset>

Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 


The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


Use of R15: Specifying R15 as register Rn has UNPREDICTABLE results. 
Register post-indexed

[Rn], +/-Rm

Description This addressing mode calculates its address from the value of base register Rn.

If the condition specified in the instruction matches the condition code status,
the value of the index register Rm is added to or subtracted from the value of
the base register Rn and written back to the base register Rn. The conditions are
defined in 3.3 The Condition Field on page 3-4.





Operation <address> = Rn
          if ConditionPassed(<cond>) then
              if U == 1 then
                  Rn = Rn + Rm
              else /* U == 0 */
                  Rn = Rn - Rm

Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 


The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


Use of R15: Specifying R15 as register Rn or Rm has UNPREDICTABLE results. 


Operand Restrictions: If the same register is specified for Rn and Rm, the result is 
UNPREDICTABLE. 
Scaled register post-indexed

[Rn], +/-Rm, LSL #<shift_imm>
[Rn], +/-Rm, LSR #<shift_imm>
[Rn], +/-Rm, ASR #<shift_imm>
[Rn], +/-Rm, ROR #<shift_imm>
[Rn], +/-Rm, RRX

Description If the condition specified in the instruction matches the condition code status,
the shifted or rotated value of index register Rm is added to or subtracted from
the value of the base register Rn and written back to the base register Rn.
The conditions are defined in 3.3 The Condition Field on page 3-4.
Operation <address> = Rn
          case <shift> of
              00 /* LSL */
                  <index> = Rm Logical_Shift_Left <shift_imm>
              01 /* LSR */
                  <index> = Rm Logical_Shift_Right <shift_imm>
              10 /* ASR */
                  <index> = Rm Arithmetic_Shift_Right <shift_imm>
              11 /* ROR or RRX */
                  if <shift_imm> == 0 then /* RRX */
                      <index> = (C Flag Logical_Shift_Left 31)
                                OR (Rm Logical_Shift_Right 1)
                  else /* ROR */
                      <index> = Rm Rotate_Right <shift_imm>
          endcase
          if ConditionPassed(<cond>) then
              if U == 1 then
                  Rn = Rn + <index>
              else /* U == 0 */
                  Rn = Rn - <index>


Notes The B bit: This bit distinguishes between an unsigned byte (B==1) and a word 
(B==0) access. 
The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 
Use of R15: Specifying R15 as register Rm or Rn has UNPREDICTABLE results. 


Operand Restrictions: If the same register is specified for Rn and Rm, the result is 
UNPREDICTABLE. 


3.18 Load and Store Halfword or Load Signed Byte Addressing Modes

Architecture v4 only

There are six addressing modes which are used to calculate the address for a load
and store (signed or unsigned) halfword or load signed byte instruction.

1 Immediate offset page 3-110
LDR|STR{<cond>}H|SH|SB Rd, [Rn, #+/-<8_bit_offset>]

2 Register offset page 3-111
LDR|STR{<cond>}H|SH|SB Rd, [Rn, +/-Rm]

3 Immediate pre-indexed page 3-112
LDR|STR{<cond>}H|SH|SB Rd, [Rn, #+/-<8_bit_offset>]!

4 Register pre-indexed page 3-113
LDR|STR{<cond>}H|SH|SB Rd, [Rn, +/-Rm]!

5 Immediate post-indexed page 3-114
LDR|STR{<cond>}H|SH|SB Rd, [Rn], #+/-<8_bit_offset>

6 Register post-indexed page 3-115
LDR|STR{<cond>}H|SH|SB Rd, [Rn], +/-Rm


General encoding

Immediate offset/index
31-28  27-25  24  23  22  21  20  19-16  15-12  11-8    7  6  5  4  3-0
cond   0 0 0  P   U   1   W   L   Rn     Rd     immedH  1  S  H  1  immedL

Register offset/index
31-28  27-25  24  23  22  21  20  19-16  15-12  11-8  7  6  5  4  3-0
cond   0 0 0  P   U   0   W   L   Rn     Rd     SBZ   1  S  H  1  Rm

Notes The P bit: Pre-indexing (P==1) indicates the offset is applied to the base register,
and the result is used as the address.

Post-indexing (P==0) indicates the base register value is used for the
address; the offset is then applied to the base register and written back to the
base register.

The U bit: Indicates whether the offset is added to the base (U==1) or subtracted
from the base (U==0).


The W bit: If P is set, W indicates that the calculated address will be written back 
to the base register; if P is clear, the W bit must be clear or the instruction is 
UNPREDICTABLE. 

The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 

The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 

The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 
Immediate offset

Architecture v4 only

[Rn, #+/-<8_bit_offset>]

Description This addressing mode is used for accessing structure (record) fields, and
accessing parameters and local variables in a stack frame. With an offset of zero,
the address produced is the unaltered value of the base register Rn.


It calculates an address by adding or subtracting the value of an immediate offset 
to or from the value of the base register Rn. 


Operation <8_bit_offset> = (<immedH> << 4) OR <immedL>
          if U == 1 then
              <address> = Rn + <8_bit_offset>
          else /* U == 0 */
              <address> = Rn - <8_bit_offset>

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0)
instruction. 

The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 

The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 

Use of R15: If R15 is specified as register Rn, the value used is the address of 
the instruction plus 8. 
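
Since the 8-bit offset is split across two fields of the instruction, a short C sketch of reassembling it may be useful. It assumes the field positions given in the general encoding for this section (immedH in bits 11 to 8, immedL in bits 3 to 0, U in bit 23); the function names halfword_offset8 and halfword_immediate_address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* <8_bit_offset> = (<immedH> << 4) OR <immedL> */
static uint32_t halfword_offset8(uint32_t instr)
{
    uint32_t immed_h = (instr >> 8) & 0xFu;    /* bits 11-8 */
    uint32_t immed_l = instr & 0xFu;           /* bits 3-0 */
    uint32_t offset8 = (immed_h << 4) | immed_l;
    assert(offset8 <= 0xFFu);
    return offset8;
}

/* Address for the immediate offset form, given the current value of Rn. */
static uint32_t halfword_immediate_address(uint32_t instr, uint32_t rn)
{
    uint32_t offset8 = halfword_offset8(instr);
    int u_bit = (int)((instr >> 23) & 1u);     /* U: add (1) or subtract (0) */
    return u_bit ? rn + offset8 : rn - offset8;
}
```

As a worked case, the word 0xE1D101B2 has immedH = 1 and immedL = 2, so the offset is 0x12; with U set and Rn holding 0x1000 the address is 0x1012.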
Register offset

Architecture v4 only

[Rn, +/-Rm]

Description This addressing mode is useful for pointer + offset arithmetic, and for accessing
a single element of an array.

It calculates an address by adding or subtracting the value of the index register Rm
to or from the value of the base register Rn.

Operation if U == 1 then
              <address> = Rn + Rm
          else /* U == 0 */
              <address> = Rn - Rm

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0)
instruction.

The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0)
halfword access.

The H bit: This bit distinguishes between a halfword (H==1) and a signed byte
(H==0) access.

Use of R15: If R15 is specified as register Rn, the value used is the address of
the instruction plus 8. Specifying R15 as register Rm has UNPREDICTABLE
results.
Immediate pre-indexed

Architecture v4 only

[Rn, #+/-<8_bit_offset>]!

Description This addressing mode gives pointer access to arrays, with automatic update of
the pointer value.

It calculates an address by adding or subtracting the value of an immediate offset
to or from the value of the base register Rn.

If the condition specified in the instruction matches the condition code status,
the calculated address is written back to the base register Rn. The conditions are
defined in 3.3 The Condition Field on page 3-4.

Operation <8_bit_offset> = (<immedH> << 4) OR <immedL>
          if U == 1 then
              <address> = Rn + <8_bit_offset>
          else /* U == 0 */
              <address> = Rn - <8_bit_offset>
          if ConditionPassed(<cond>) then
              Rn = <address>

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 

The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 

Use of R15: Specifying R15 as register Rn has UNPREDICTABLE results. 
Register pre-indexed

Architecture v4 only

[Rn, +/-Rm]!

Description This addressing mode calculates an address by adding or subtracting the value of
the index register Rm to or from the value of the base register Rn.

If the condition specified in the instruction matches the condition code status,
the calculated address is written back to the base register Rn. The conditions are
defined in 3.3 The Condition Field on page 3-4.

Operation if U == 1 then
              <address> = Rn + Rm
          else /* U == 0 */
              <address> = Rn - Rm
          if ConditionPassed(<cond>) then
              Rn = <address>

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 


The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 


Use of R15: Specifying R15 as register Rm or Rn has UNPREDICTABLE results. 
Immediate post-indexed

Architecture v4 only

[Rn], #+/-<8_bit_offset>

Description This addressing mode gives pointer access to arrays, with automatic update of
the pointer value.

It calculates an address from the value of base register Rn.

If the condition specified in the instruction matches the condition code status,
the value of the immediate offset is added to or subtracted from the value of
the base register Rn and written back to the base register Rn. The conditions are
defined in 3.3 The Condition Field on page 3-4.

Operation <address> = Rn
          <8_bit_offset> = (<immedH> << 4) OR <immedL>
          if ConditionPassed(<cond>) then
              if U == 1 then
                  Rn = Rn + <8_bit_offset>
              else /* U == 0 */
                  Rn = Rn - <8_bit_offset>

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 


The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 


Use of R15: Specifying R15 as register Rn has UNPREDICTABLE results. 
Register post-indexed

Architecture v4 only

[Rn], +/-Rm

Description If the condition specified in the instruction matches the condition code status,
the value of the index register Rm is added to or subtracted from the value of
the base register Rn and written back to the base register Rn. The conditions are
defined in 3.3 The Condition Field on page 3-4.

Operation <address> = Rn
          if ConditionPassed(<cond>) then
              if U == 1 then
                  Rn = Rn + Rm
              else /* U == 0 */
                  Rn = Rn - Rm

Notes The L bit: This bit distinguishes between a Load (L==1) and a Store (L==0) 
instruction. 


The S bit: This bit distinguishes between a signed (S==1) and an unsigned (S==0) 
halfword access. 


The H bit: This bit distinguishes between a halfword (H==1) and a signed byte 
(H==0) access. 


Use of R15: Specifying R15 as register Rm or Rn has UNPREDICTABLE results. 
3.19 Load and Store Multiple Addressing Modes

Load Multiple instructions load a subset (possibly all) of the general-purpose
registers from memory. Store Multiple instructions store a subset (possibly all) of
the general-purpose registers to memory. These instructions have a single
instruction format.

Load and Store Multiple addressing modes produce a sequential range of
addresses. The lowest-numbered register is stored at the lowest memory address
and the highest-numbered register at the highest memory address.

There are four Load and Store Multiple addressing modes:

1 Increment After page 3-117
LDM|STM{<cond>}IA Rn{!}, <registers>{^}

2 Increment Before page 3-118
LDM|STM{<cond>}IB Rn{!}, <registers>{^}

3 Decrement After page 3-119
LDM|STM{<cond>}DA Rn{!}, <registers>{^}

4 Decrement Before page 3-120
LDM|STM{<cond>}DB Rn{!}, <registers>{^}

General encoding
31-28  27-25  24  23  22  21  20  19-16  15-0
cond   1 0 0  P   U   S   W   L   Rn     register list
Notes The register list: The <register_list> has 1 bit for each general-purpose
register; bit 0 is for register zero, bit 15 is for register 15 (the PC).
The <register_list> is specified in the instruction mnemonic using
a comma-separated list of registers, surrounded by brackets. If no bits are set,
the result is UNPREDICTABLE.

The U bit: Indicates that the transfer is made upwards (U==1) or downwards
(U==0) from the base register.

The P bit: Pre-indexing or post-indexing:
P==1   indicates that each address in the range is incremented (U==1) or
       decremented (U==0) before it is used to access memory.
P==0   indicates that each address in the range is incremented (U==1) or
       decremented (U==0) after it is used to access memory.


The W bit: Indicates that the base register will be updated after the transfer. The 
base register is incremented (U==1) or decremented (U==0) by four times 
the number of registers in the register list. 


The S bit: For LDMs that load the PC, the S bit indicates that the CPSR is loaded 
from the SPSR. For LDMs that do not load the PC and all STMs, the S bit 
indicates that when the processor is in a privileged mode, the user mode 
banked registers are transferred and not the registers of the current mode. 


The L bit: Distinguishes between Load (L==1) and Store (L==0) instructions. 
Increment after

LDM|STM{<cond>}IA Rn{!}, <registers>{^}

Description This addressing mode is for Load and Store multiple instructions, and forms
a range of addresses.


The first address formed is the <start_address>, and is the value of the base 
register Rn. Subsequent addresses are formed by incrementing the previous 
address by four. One address is produced for each register that is specified in 
<register_list>. 


The last address produced is the <end_address>; its value is four less than 
the sum of the value of the base register and four times the number of registers 
specified in <register_list>. 


If the condition specified in the instruction matches the condition code status and
the W bit is set, Rn is incremented by four times the number of registers in
<register_list>. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation <start_address> = Rn
          <end_address> = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4) - 4
          if ConditionPassed(<cond>) and W == 1 then
              Rn = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4)

Qualifiers ! sets the W bit, causing base register update
           ^ is used to set the S bit, see below.


Notes The L bit: This bit distinguishes between a load multiple and a store multiple. 


The S bit: The action of the S bit is described in section 3.19 Load and Store 
Multiple Addressing Modes on page 3-116. 
Increment before

LDM|STM{<cond>}IB Rn{!}, <registers>{^}


Description This addressing mode is for Load and Store multiple instructions, and forms 
a range of addresses. 


The first address formed is the <start_address>, and is the value of the base
register Rn plus four. Subsequent addresses are formed by incrementing
the previous address by four. One address is produced for each register that is
specified in <register_list>.


The last address produced is the <end_address>; its value is the sum of 
the value of the base register and four times the number of registers specified in 
<register_list>. 


If the condition specified in the instruction matches the condition code status and
the W bit is set, Rn is incremented by four times the number of registers in
<register_list>. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation <start_address> = Rn + 4
          <end_address> = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4)
          if ConditionPassed(<cond>) and W == 1 then
              Rn = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4)

Qualifiers ! sets the W bit, causing base register update
           ^ is used to set the S bit, see below.


Notes The L bit: This bit distinguishes between a load multiple and a store multiple. 


The S bit: The action of the S bit is described in section 3.19 Load and Store 
Multiple Addressing Modes on page 3-116. 
Decrement after

LDM|STM{<cond>}DA Rn{!}, <registers>{^}

Description This addressing mode is for Load and Store multiple instructions, and forms
a range of addresses.


The first address formed is the <start_address>, and is the value of the base 
register minus four times the number of registers specified in <register_list>, 
plus 4. Subsequent addresses are formed by incrementing the previous address 
by four. One address is produced for each register that is specified in 
<register_list>. 


The last address produced is the <end_address>; its value is the value of base 
register Rn. 


If the condition specified in the instruction matches the condition code status and
the W bit is set, Rn is decremented by four times the number of registers in
<register_list>. The conditions are defined in 3.3 The Condition Field on
page 3-4.

Operation <start_address> = Rn - (Number_Of_Set_Bits_In(<register_list>) * 4) + 4
          <end_address> = Rn
          if ConditionPassed(<cond>) and W == 1 then
              Rn = Rn - (Number_Of_Set_Bits_In(<register_list>) * 4)

Qualifiers ! sets the W bit, causing base register update
           ^ is used to set the S bit, see below.

Notes The L bit: This bit distinguishes between a load multiple and a store multiple. 


The S bit: The action of the S bit is described in section 3.19 Load and Store 
Multiple Addressing Modes on page 3-116. 
Decrement before

LDM|STM{<cond>}DB Rn{!}, <registers>{^}


Description This addressing mode is for Load and Store multiple instructions which form 
a range of addresses. 


The first address formed is the <start_address>, and is the value of the base 
register minus four times the number of registers specified in <register_list>. 
Subsequent addresses are formed by incrementing the previous address by four. 
One address is produced for each register that is specified in <register_list>. 


The last address produced is the <end_address>; its value is the value of base 
register Rn minus four. 


If the condition specified in the instruction matches the condition code status and
the W bit is set, Rn is decremented by four times the number of registers in
<register_list>. The conditions are defined in 3.3 The Condition Field on
page 3-4.
Encoding   cond [31:28] | 1 0 0 [27:25] | P=1 [24] | U=0 [23] | S [22] | W [21] | L [20] | Rn [19:16] | register list [15:0]

Operation  <start_address> = Rn - (Number_Of_Set_Bits_In(<register_list>) * 4)
           <end_address> = Rn - 4
           if ConditionPassed(<cond>) and W == 1 then
               Rn = Rn - (Number_Of_Set_Bits_In(<register_list>) * 4)

Qualifiers ! sets the W bit, causing base register update
           ^ is used to set the S bit, see below.


Notes The L bit: This bit distinguishes between a load multiple and a store multiple. 


The S bit: The action of the S bit is described in section 3.19 Load and Store 
Multiple Addressing Modes on page 3-116. 
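
The address ranges described for the Load and Store Multiple addressing modes can be checked with a small model. The following Python sketch is illustrative only: the function name and argument order are assumptions, and the Increment After case (first address equal to Rn) is included on the assumption that it follows the same pattern as the three modes described above.

```python
def ldm_stm_addresses(mode, rn, register_list, writeback=False):
    """Model the address range of an LDM/STM instruction.

    mode is one of "IA", "IB", "DA", "DB"; register_list is the 16-bit
    mask from the instruction. Returns (addresses, new_rn), where
    addresses holds one word address per set bit, lowest address first.
    """
    count = bin(register_list & 0xFFFF).count("1")
    size = 4 * count
    if mode == "IA":
        start = rn
    elif mode == "IB":
        start = rn + 4
    elif mode == "DA":
        start = rn - size + 4
    elif mode == "DB":
        start = rn - size
    else:
        raise ValueError("unknown addressing mode: %r" % (mode,))
    addresses = [(start + 4 * i) & 0xFFFFFFFF for i in range(count)]
    # The W bit moves the base register by four times the register count.
    step = size if mode in ("IA", "IB") else -size
    new_rn = (rn + step) & 0xFFFFFFFF if writeback else rn
    return addresses, new_rn
```

For example, with Rn = 0x1000 and three registers in the list, Decrement Before yields 0xFF4, 0xFF8 and 0xFFC, so the last address is Rn minus four as stated in the description.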


ARM Architecture Reference Manual 
ARM DUI 0100B 




3.20 Load and Store Multiple Addressing Modes (Alternative names)


3.20.1 Block data transfer 


The four addressing mode names given in 3.19 Load and Store Multiple
Addressing Modes on page 3-116 (IA, IB, DA, DB) are most useful when load and
store multiple instructions are being used for block data transfer, as it is likely
that the Load Multiple and Store Multiple will have the same addressing mode,
so that the data is stored in the same way that it was loaded.


However, if Load Multiple and Store Multiple are being used to access a stack, 
the data will not be loaded with the same addressing mode that was used to store 
the data, because the load (pop) and store (push) operations must adjust the stack 
in opposite directions. 


3.20.2 Stack operations 


Load Multiple and Store Multiple addressing modes may be specified with an 
alternative syntax, which is more applicable to stack operations. Two attributes are 
used to describe the stack. 


Full or Empty
    Full        is defined to have the stack pointer pointing to the last used
                (full) location in the stack
    Empty       is defined to have the stack pointer pointing to the first
                unused (empty) location in the stack

Ascending or Descending
    Descending  grows towards decreasing memory address
                (towards the bottom of memory)
    Ascending   grows towards increasing memory address
                (towards the top of memory)

This allows four types of stack to be defined: 
1 Full Descending (FD) 
2 Empty Descending (ED) 
3 Full Ascending (FA) 
4 Empty Ascending (EA) 


Table 3-1: LDM/STM addressing modes shows the relationship between the four 
types of stack, the four types of addressing mode shown above, and the L, U and 
P bits in the instruction format: 

Instruction   Addressing Mode         Stack Type              L bit  P bit  U bit
LDM (Load)    IA (Increment After)    FD (Full Descending)      1      0      1
STM (Store)   IA (Increment After)    EA (Empty Ascending)      0      0      1
LDM (Load)    IB (Increment Before)   ED (Empty Descending)     1      1      1
STM (Store)   IB (Increment Before)   FA (Full Ascending)       0      1      1
LDM (Load)    DA (Decrement After)    FA (Full Ascending)       1      0      0
STM (Store)   DA (Decrement After)    ED (Empty Descending)     0      0      0
LDM (Load)    DB (Decrement Before)   EA (Empty Ascending)      1      1      0
STM (Store)   DB (Decrement Before)   FD (Full Descending)      0      1      0

Table 3-1: LDM/STM addressing modes
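
The relationship between the stack-oriented names and the L, P and U bits can be expressed as a lookup. The sketch below is a hypothetical helper, not part of any assembler; the names STACK_ALIASES and encode_bits are assumptions made for illustration.

```python
# (instruction, stack type) -> block-transfer addressing mode
STACK_ALIASES = {
    ("LDM", "FD"): "IA", ("STM", "FD"): "DB",
    ("LDM", "FA"): "DA", ("STM", "FA"): "IB",
    ("LDM", "ED"): "IB", ("STM", "ED"): "DA",
    ("LDM", "EA"): "DB", ("STM", "EA"): "IA",
}


def encode_bits(op, stack):
    """Return the L, P and U bits for an LDM/STM written with a stack name."""
    mode = STACK_ALIASES[(op, stack)]
    return {
        "L": 1 if op == "LDM" else 0,       # load or store
        "P": 1 if mode[1] == "B" else 0,    # Before (1) or After (0)
        "U": 1 if mode[0] == "I" else 0,    # Increment (1) or Decrement (0)
    }
```

A push onto a full descending stack (STMFD) therefore assembles as STMDB, and the matching pop (LDMFD) as LDMIA, which is why the two operations move the stack pointer in opposite directions.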
3.21 Load and Store Coprocessor Addressing Modes

General encoding


There are three addressing modes which are used to calculate the address of 
a load or store coprocessor instruction: 


1   Immediate offset                                          page 3-124
    <opcode>{<cond>}{L} p<cp#>, CRd, [Rn, #+/-(<8_bit_offset>*4)]

2   Immediate pre-indexed                                     page 3-125
    <opcode>{<cond>}{L} p<cp#>, CRd, [Rn, #+/-(<8_bit_offset>*4)]!

3   Immediate post-indexed                                    page 3-126
    <opcode>{<cond>}{L} p<cp#>, CRd, [Rn], #+/-(<8_bit_offset>*4)

Encoding   cond [31:28] | 1 1 0 [27:25] | P [24] | U [23] | N [22] | W [21] | L [20] | Rn [19:16] | CRd [15:12] | cp_num [11:8] | 8_bit_offset [7:0]


Notes The P bit: Pre-indexing (P==1) or post-indexing (P==0): 
(P==1) indicates that the offset is added to the base register, and the result 
is used as the address. 


(P==0) indicates that the base register value is used for the address; the 
offset is then added to the base register and written back to the base 
register (because W will equal 1, see below). 

The U bit: Indicates that the offset is added to the base (U==1) or that the offset is 
subtracted from the base (U==0). 

The N bit: The meaning of this bit is coprocessor-dependent; its recommended 
use is to distinguish between different-sized values to be transferred. 

The W bit: This indicates that the calculated address will be written back to 
the base register. If P is 0, W must equal 1 or the result is UNPREDICTABLE. 


The L bit: Distinguishes between Load (L==1) and Store (L==0) instructions. 
Immediate offset

Description This addressing mode produces a sequence of consecutive addresses.

The first address is calculated by adding or subtracting 4 times the value of
an immediate offset to or from the value of the base register Rn. The subsequent
addresses in the sequence are produced by incrementing the previous address by
four until the coprocessor signals the end of the instruction. This allows
a coprocessor to access data whose size is coprocessor-defined.


The coprocessor must not request a transfer of more than 16 words. 


Encoding   cond [31:28] | 1 1 0 [27:25] | P=1 [24] | U [23] | N [22] | W=0 [21] | L [20] | Rn [19:16] | CRd [15:12] | cp_num [11:8] | 8_bit_offset [7:0]

Operation  if ConditionPassed(<cond>) then
               if U == 1 then
                   <address> = Rn + <8_bit_offset> * 4
               else /* U == 0 */
                   <address> = Rn - <8_bit_offset> * 4
               <start_address> = <address>
               while (NotFinished(coprocessor[<cp_num>]))
                   <address> = <address> + 4
               <end_address> = <address>





Notes The N bit: This bit is coprocessor-dependent. 
The L bit: Distinguishes between Load (L==1) and Store (L==0) instructions. 


Use of R15: If R15 is specified as register Rn, the value used is the address of 
the instruction plus 8. 
Immediate pre-indexed

Description This addressing mode produces a sequence of consecutive addresses.

The first address is calculated by adding or subtracting 4 times the value of
an immediate offset to or from the value of the base register Rn. The first address
is written back to the base register Rn. The subsequent addresses in the sequence
are produced by incrementing the previous address by four until the coprocessor
signals the end of the instruction. This allows a coprocessor to access data whose
size is coprocessor-defined.


The coprocessor must not request a transfer of more than 16 words. 





Encoding   cond [31:28] | 1 1 0 [27:25] | P=1 [24] | U [23] | N [22] | W=1 [21] | L [20] | Rn [19:16] | CRd [15:12] | cp_num [11:8] | 8_bit_offset [7:0]

Operation  if ConditionPassed(<cond>) then
               if U == 1 then
                   Rn = Rn + <8_bit_offset> * 4
               else /* U == 0 */
                   Rn = Rn - <8_bit_offset> * 4
               <start_address> = Rn
               <address> = <start_address>
               while (NotFinished(coprocessor[<cp_num>]))
                   <address> = <address> + 4
               <end_address> = <address>


Notes The N bit: This bit is coprocessor-dependent. 


The L bit: Distinguishes between Load (L==1) and Store (L==0) instructions. 


Use of R15: If R15 is specified as register Rn, the value used is the address of 
the instruction plus 8. 
Immediate post-indexed

Description This addressing mode produces a sequence of consecutive addresses.

The first address is the value of the base register Rn. The subsequent addresses
in the sequence are produced by incrementing the previous address by four until
the coprocessor signals the end of the instruction. This allows a coprocessor to
access data whose size is coprocessor-defined.


The base register Rn is updated by adding or subtracting 4 times the value of 
an immediate offset to or from the value of the base register Rn. 


The coprocessor must not request a transfer of more than 16 words. 


Encoding   cond [31:28] | 1 1 0 [27:25] | P=0 [24] | U [23] | N [22] | W=1 [21] | L [20] | Rn [19:16] | CRd [15:12] | cp_num [11:8] | 8_bit_offset [7:0]

Operation  if ConditionPassed(<cond>) then
               <start_address> = Rn
               if U == 1 then
                   Rn = Rn + <8_bit_offset> * 4
               else /* U == 0 */
                   Rn = Rn - <8_bit_offset> * 4
               <address> = <start_address>
               while (NotFinished(coprocessor[<cp_num>]))
                   <address> = <address> + 4
               <end_address> = <address>





Notes The N bit: This bit is coprocessor-dependent. 
The L bit: Distinguishes between Load (L==1) and Store (L==0) instructions. 


Use of R15: If R15 is specified as register Rn the value used is the address of 
the instruction plus 8. 


The W bit: If bit 21 (the Writeback bit) is not set, the result is UNPREDICTABLE. 
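
The three coprocessor addressing modes differ only in where the first address comes from and whether Rn is updated. The following Python sketch models this; it is an illustration under stated assumptions rather than a description of any implementation. The function name is hypothetical, and the `words` argument stands in for the coprocessor signalling the end of the instruction.

```python
def coprocessor_addresses(mode, rn, offset8, u, words):
    """Model the word addresses of a load/store coprocessor instruction.

    mode    -- "offset", "pre" or "post"
    rn      -- value of the base register
    offset8 -- the 8-bit immediate (scaled by 4 to give a byte offset)
    u       -- 1 to add the offset, 0 to subtract it
    words   -- number of words the coprocessor transfers (1 to 16)
    Returns (addresses, new_rn).
    """
    if not 1 <= words <= 16:
        raise ValueError("a coprocessor must not request more than 16 words")
    delta = 4 * (offset8 & 0xFF)
    target = (rn + delta if u else rn - delta) & 0xFFFFFFFF
    if mode == "offset":
        start, new_rn = target, rn        # base register unchanged
    elif mode == "pre":
        start, new_rn = target, target    # first address written back
    elif mode == "post":
        start, new_rn = rn, target        # base used first, then updated
    else:
        raise ValueError("unknown addressing mode: %r" % (mode,))
    addresses = [(start + 4 * i) & 0xFFFFFFFF for i in range(words)]
    return addresses, new_rn
```

With Rn = 0x2000, an offset of 2 and U == 1, the pre-indexed form transfers from 0x2008 upwards and leaves Rn at 0x2008, while the post-indexed form transfers from 0x2000 upwards and then sets Rn to 0x2008.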


ARM Code Sequences 







The ARM instruction set is a powerful tool for generating high-performance 
microprocessor systems. Used to full extent, the ARM instruction set allows very 
compact and efficient algorithms to be coded. This chapter contains some sample 
routines that provide insight into the ARM instruction set. 


4.1 Arithmetic Instructions 4-2 
4.2 Branch Instructions 4-4 
4.3 Load and Store Instructions 4-6 
4.4 Load and Store Multiple Instructions 4-8 
4.5 Semaphore Instructions 4-9 
4.6 Other Code Examples 4-10 
4.1 Arithmetic Instructions


The following code sequences illustrate some ways of using ARM's data-processing
instructions.


4.1.1 Bit field manipulation


ARM shift and logical operations are very useful for bit manipulation:


; Extract 8 bits from the top of R2 and insert them into
; the bottom of R3
; R0 is a temporary value
        MOV     R0, R2, LSR #24         ; extract top bits from R2 into R0
        ORR     R3, R0, R3, LSL #8      ; shift up R3 and insert R0





4.1.2 Multiplication by constant


Combinations of shifts, add with shifts, and reverse subtract with shift can be used to
perform multiplication by constant:


; multiplication of R0 by 2^n
        MOV     R0, R0, LSL #n          ; R0 = R0 << n

; multiplication of R0 by 2^n + 1
        ADD     R0, R0, R0, LSL #n      ; R0 = R0 + R0 << n

; multiplication of R0 by 2^n - 1
        RSB     R0, R0, R0, LSL #n      ; R0 = R0 << n - R0

; R0 = R0 * 10 + R1
        ADD     R0, R0, R0, LSL #2      ; R0 = R0 * 5
        ADD     R0, R1, R0, LSL #1      ; R0 = R1 + R0 * 2

; R0 = R0 * 100 + R1, R2 is destroyed
        ADD     R2, R0, R0, LSL #3      ; R2 = R0 * 9
        ADD     R0, R2, R0, LSL #4      ; R0 = R2 + R0 * 16 (R0 = R0 * 25)
        ADD     R0, R1, R0, LSL #2      ; R0 = R1 + R0 * 4
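
The shift-and-add sequences can be checked by modelling each instruction as 32-bit modular arithmetic. The Python sketch below is only a verification aid; the function names are assumptions chosen for illustration.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def lsl(value, amount):
    """Logical shift left, truncated to 32 bits."""
    return (value << amount) & MASK


def times_10_plus(r0, r1):
    """Model of the two-instruction sequence computing R0 * 10 + R1."""
    r0 = (r0 + lsl(r0, 2)) & MASK   # R0 + (R0 << 2) = R0 * 5
    r0 = (r1 + lsl(r0, 1)) & MASK   # R1 + (R0 << 1) = R1 + R0 * 2
    return r0


def times_100_plus(r0, r1):
    """Model of the three-instruction sequence computing R0 * 100 + R1."""
    r2 = (r0 + lsl(r0, 3)) & MASK   # R2 = R0 * 9
    r0 = (r2 + lsl(r0, 4)) & MASK   # R0 = R0 * 9 + R0 * 16 = R0 * 25
    r0 = (r1 + lsl(r0, 2)) & MASK   # R0 = R1 + R0 * 4
    return r0
```

Because every step wraps at 32 bits, the results match ordinary multiplication modulo 2^32, which is the behaviour of the ARM instructions when no overflow checking is done.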
4.1.3 Multi-precision arithmetic


Arithmetic instructions allow efficient arithmetic on 64-bit or larger objects:

    Add, and Add with Carry perform multi-precision addition
    Subtract, and Subtract with Carry perform subtraction
    Compare can be used for comparison


; On entry : R0 and R1 hold a 64-bit number
;            (R0 is least significant)
;          : R2 and R3 hold a second 64-bit number
; On exit  : R0 and R1 hold the 64-bit sum (or difference) of the 2 numbers
add64   ADDS    R0, R0, R2      ; add lower halves and update Carry flag
        ADC     R1, R1, R3      ; add the high halves and Carry flag

sub64   SUBS    R0, R0, R2      ; subtract lower halves, update Carry
        SBC     R1, R1, R3      ; subtract high halves and Carry


; This routine compares two 64-bit numbers
; On entry: As above
; On exit : N, Z, C and V flags updated correctly
cmp64   CMP     R1, R3          ; compare high halves, if they are
        CMPEQ   R0, R2          ; equal, then compare lower halves
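
The way the Carry flag links the two halves can be seen in a small model. This Python sketch assumes the ARM convention that the C flag after a subtraction is set when no borrow occurred; the function names simply mirror the labels used in the listing.

```python
MASK = 0xFFFFFFFF


def add64(r0, r1, r2, r3):
    """Return (low, high) of the 64-bit sum, as ADDS followed by ADC."""
    low = r0 + r2
    carry = low >> 32               # C flag produced by ADDS
    high = r1 + r3 + carry          # ADC adds the carry in
    return low & MASK, high & MASK


def sub64(r0, r1, r2, r3):
    """Return (low, high) of the 64-bit difference, as SUBS followed by SBC."""
    low = r0 - r2
    carry = 1 if r0 >= r2 else 0    # C flag set means "no borrow"
    high = r1 - r3 - (1 - carry)    # SBC subtracts NOT(carry)
    return low & MASK, high & MASK
```

Adding 1 to 0x00000000FFFFFFFF carries into the high word, and subtracting 1 from 0x0000000100000000 borrows from it.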


4.1.4 Swapping endianness


Swapping the order of bytes in a word (the endianness) can be performed in two ways.


1   The first method is best for single words:


; On entry: R0 holds the word to be swapped
; On exit : R0 holds the swapped word, R1 is destroyed
byteswap                                ; R0 = A , B , C , D
        EOR     R1, R0, R0, ROR #16     ; R1 = A^C,B^D,C^A,D^B
        BIC     R1, R1, #0xff0000       ; R1 = A^C, 0 ,C^A,D^B
        MOV     R0, R0, ROR #8          ; R0 = D , A , B , C
        EOR     R0, R0, R1, LSR #8      ; R0 = D , C , B , A


2   The second method is best for swapping the endianness of a large number of
    words:


; On entry: R0 holds the word to be swapped
; On exit : R0 holds the swapped word,
;         : R1, R2 and R3 are destroyed
byteswap                                ; first the three instruction initialisation
        MOV     R2, #0xff               ; R2 = 0xff
        ORR     R2, R2, #0xff0000       ; R2 = 0x00ff00ff
        MOV     R3, R2, LSL #8          ; R3 = 0xff00ff00
        ; repeat the following code for each word to swap
                                        ; R0 = A  B  C  D
        AND     R1, R2, R0, ROR #24     ; R1 = 0  C  0  A
        AND     R0, R3, R0, ROR #8      ; R0 = D  0  B  0
        ORR     R0, R0, R1              ; R0 = D  C  B  A
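
Both byte-swapping sequences can be traced instruction by instruction. The Python sketch below models the rotates and logical operations on 32-bit values; the function names are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF


def ror(value, amount):
    """Rotate a 32-bit value right by amount bits (0 < amount < 32)."""
    return ((value >> amount) | (value << (32 - amount))) & MASK


def byteswap_single(r0):
    """Four-instruction method, suited to a single word."""
    r1 = r0 ^ ror(r0, 16)           # A^C, B^D, C^A, D^B
    r1 &= ~0x00FF0000 & MASK        # A^C, 0,   C^A, D^B
    r0 = ror(r0, 8)                 # D,   A,   B,   C
    return r0 ^ (r1 >> 8)           # D,   C,   B,   A


def byteswap_masked(r0):
    """Three-instruction inner loop, using two precomputed masks."""
    r2 = 0x00FF00FF
    r3 = (r2 << 8) & MASK           # 0xFF00FF00
    r1 = r2 & ror(r0, 24)           # 0, C, 0, A
    r0 = r3 & ror(r0, 8)            # D, 0, B, 0
    return r0 | r1                  # D, C, B, A
```

Both functions map 0x11223344 to 0x44332211, and applying either one twice returns the original word.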
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4.2 Branch Instructions 


The following code sequences show some different ways of controlling the flow of 
execution in ARM code. 


4.2.1. Procedure call and return 


The BL (Branch and Link) instruction makes a procedure call by preserving the address 
of the instruction after the BL in R14 (the link register or LR), and then branching to the 
target address. Returning from a procedure is achieved by moving R14 to the PC: 


        BL      function        ; call 'function'
                                ; procedure returns to here


function                        ; function body


        MOV     PC, LR          ; Put R14 into PC to return


4.2.2 Conditional execution 


Conditional execution allows if-then-else statements to be collapsed into sequences that
do not require forward branches:


/* C code for Euclid's Greatest Common Divisor (GCD) */
/* Returns the GCD of its two parameters */
int gcd(int a, int b)
{   while (a != b)
        if (a > b)
            a = a - b;
        else
            b = b - a;
    return a;
}


; ARM assembler code for Euclid's Greatest Common Divisor
; On entry: R0 holds 'a', R1 holds 'b'
; On exit : R0 holds the GCD of 'a' and 'b'
gcd     CMP     R0, R1          ; compare 'a' and 'b'
        SUBGT   R0, R0, R1      ; if (a>b) a=a-b (if a==b do nothing)
        SUBLT   R1, R1, R0      ; if (b>a) b=b-a (if a==b do nothing)
        BNE     gcd             ; if (a!=b) then keep going
        MOV     PC, LR          ; return to caller
4.2.3 Conditional compare instructions


Compare instructions can be conditionally executed to implement more complicated
expressions:


        if (a==0 || b==1)
            c = d + e;


        CMP     R0, #0          ; compare a with 0
        CMPNE   R1, #1          ; if a is not 0, compare b to 1
        ADDEQ   R2, R3, R4      ; if either was true c = d + e





4.2.4 Loop variables 


The subtract instruction can be used to both decrement a loop counter and set the 
condition codes to test for a zero. 


MOV RO, #loopcount ; initialise the loop counter 
loop ; loop body 
SUBS RO, RO, #1 ; subtract 1 from counter 
; set condition codes 
BNE loop ; if not zero, continue looping 





4.2.5 Multi-way branch 


A very simple multi-way branch can be implemented with a single instruction. 

The following code dispatches the control of execution to any number of routines, with 
the restriction that the code to handle each case of the multi-way branch is the same 
size, and that size is a power of two bytes. 


; Multi-way branch
; On entry: R0 holds the branch index
        CMP     R0, #maxindex           ; (optional) checks the index is in range
        ADDLT   PC, PC, R0, LSL #RoutineSizeLog2
                                        ; scale index by the log of the size of
                                        ; each handler; add to the PC; jump there
        B       IndexOutOfRange         ; jump to the error handler
Index0Handler


Index1Handler


Index2Handler


Index3Handler
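
The branch target follows from the fact that R15 reads as the address of the current instruction plus 8, so an index of zero lands on the word after the B instruction, where the first handler starts. The Python sketch below is a hypothetical helper that reproduces this arithmetic; the addresses in the example are made up for illustration.

```python
def multiway_target(add_instruction_address, index, routine_size_log2):
    """Address reached by ADD PC, PC, R0, LSL #RoutineSizeLog2.

    add_instruction_address is the address of the ADD itself; the PC
    operand reads as that address plus 8.
    """
    pc = add_instruction_address + 8
    return (pc + (index << routine_size_log2)) & 0xFFFFFFFF
```

With the ADD at 0x8004, the B at 0x8008 and 16-byte handlers (RoutineSizeLog2 of 4), index 0 reaches 0x800C and index 2 reaches 0x802C.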
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4.3. Load and Store Instructions 
Load and Store instructions are the best way to load or store a single word. They are 
also the only instructions that can load or store a byte or halfword. 


4.3.1 Simple string compare 


The following code performs a very simple string compare on two zero-terminated 
strings. 


; String compare
; On entry : R0 points to the first string
;          : R1 points to the second string
;          : Call this code with a BL
; On exit  : R0 is < 0 if the first string is less than the second
;          : R0 is = 0 if the first string is equal to the second
;          : R0 is > 0 if the first string is greater than the second
;          : R1, R2 and R3 are destroyed
strcmp
        LDRB    R2, [R0], #1    ; get a byte from the first string
        LDRB    R3, [R1], #1    ; get a byte from the second string
        CMP     R2, #0          ; reached the end of the first string?
        BEQ     return          ; yes, go and calculate the return value
        CMP     R2, R3          ; compare the two bytes
        BEQ     strcmp          ; if they are equal, keep looking
return
        SUB     R0, R2, R3      ; R0 = difference of the last two bytes read
        MOV     PC, LR          ; put R14 (LR) into PC to return


Much faster implementations of this code are possible by loading a word of each string 
at a time and comparing all four bytes. 


4.3.2 Linked lists 


The following code searches for an element in a linked list. The linked list has two 
elements in each record; a single byte value and a pointer to the next record. A null next 
pointer indicates this is the last element in the list. 


; Linked list search
; On entry: R0 holds a pointer to the first record in the list
;         : R1 holds the byte we are searching for
;         : Call this code with a BL
; On exit : R0 holds the address of the first record matched
;         : or a null pointer if no match was found
;         : R2 is destroyed
llsearch
        CMP     R0, #0          ; null pointer?
        LDRNEB  R2, [R0]        ; load the byte value from this record
        CMPNE   R1, R2          ; compare with the looked-for value
        LDRNE   R0, [R0, #4]    ; if not found, follow the link to the
        BNE     llsearch        ; next record and then keep looking
        MOV     PC, LR          ; return with pointer in R0
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4.3.3. Long branch 


A load instruction can be used to generate a branch to anywhere in the 4Gbyte address
space. By manually setting the value of the link register (R14), a subroutine call can be
made to anywhere in the address space.


; Long branch (and link)
        ADD     LR, PC, #4      ; set the return address to be 8 bytes
                                ; after the next instruction
        LDR     PC, [PC, #-4]   ; get the address from the next word
        DCD     function        ; store the address of the function
                                ; (DCD is an assembler directive)
                                ; return to here


This code uses the location after the load to hold the address of the function to call.
In practice, this location can be accessed as long as it is within 4Kbytes of the load
instruction. Notice also that this code is position-independent except for the address of
the function to call. Full position-independence can be achieved by storing the offset of
the branch target after the load, and using an ADD instruction to add it to the PC.





4.3.4 Multi-way branches 


The following code improves on the multi-way branch code shown above by using 
a table of addresses of functions to call. 


; Multi-way branch
; On entry: R0 holds the branch index
        CMP     R0, #maxindex           ; (optional) checks the index is in range
        LDRLT   PC, [PC, R0, LSL #2]    ; convert the index to a word offset,
                                        ; do a look up in the table, put the loaded
                                        ; value into the PC and jump there
        B       IndexOutOfRange         ; jump to the error handler
        DCD     Handler0                ; DCD is an assembler directive to
        DCD     Handler1                ; store a word (in this case an
        DCD     Handler2                ; address) in memory.
        DCD     Handler3
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4.4 Load and Store Multiple Instructions 


Load and Store Multiple instructions are the most efficient way to manipulate blocks of 
data. 


4.4.1 Simple block copy 


This code performs a very simple block copy, 48 bytes at a time, and will approach the
maximum throughput for a particular machine. The source and destination must be
word-aligned, and objects with fewer than 48 bytes must be handled separately.


; Simple block copy function
; R12 points to the start of the source block
; R13 points to the start of the destination block
; R14 points to the end of the source block
loop    LDMIA   R12!, {R0-R11}  ; load 48 bytes
        STMIA   R13!, {R0-R11}  ; store 48 bytes
        CMP     R12, R14        ; reached the end yet?
        BLT     loop            ; branch to the top of the loop





4.4.2 Procedure entry and exit 


This code uses load and store multiple to preserve and restore the processor state
during a procedure. The code assumes that registers R0 to R3 are argument registers,
preserved by the caller of the function, and therefore do not need to be preserved.
R13 is also assumed to point to a full descending stack.


function
        STMFD   R13!, {R4 - R12, R14}   ; preserve all the local registers
                                        ; and the return address, and
                                        ; update the stack pointer.


        ; function body


        LDMFD   R13!, {R4 - R12, PC}    ; restore the local registers, load
                                        ; the PC from the saved return address
                                        ; and update the stack pointer.

Notice that this code restores all saved registers, updates the stack pointer, and returns
to the caller (by loading the PC value) all with a single instruction. This allows very
efficient conditional return for exceptional cases from a procedure (by checking the
condition with a compare instruction and then conditionally executing the load multiple).
4.5 Semaphore Instructions


This code controls the entry and exit from a critical section of code. The semaphore
instructions do not provide a compare and conditional write facility; this must be done 
explicitly. The following code achieves this by using a semaphore value to indicate that 
the lock is being inspected. 


The code below causes the calling process to busy-wait until the lock is free; to ensure 
progress, three OS calls need to be made (one before each loop branch) to sleep 
the process if the lock cannot be accessed. 


; Critical section entry and exit
; The code uses a process ID to identify the lock owner
; An ID of zero indicates the lock is free
; An ID of -1 indicates the lock is being inspected
; On entry: R0 holds the address of the semaphore
;           R1 holds the ID of the process requesting the lock

        MVN     R2, #0          ; load the 'looking' value (-1) in R2
spinin  SWP     R3, R2, [R0]    ; look at the lock, and lock others out
        CMN     R3, #1          ; anyone else trying to look?
                                ; conditional OS call to sleep process
        BEQ     spinin          ; yes, so wait our turn
        CMP     R3, #0          ; no-one looking, is the lock free?
        STRNE   R3, [R0]        ; no, then restore the previous owner
                                ; conditional OS call to sleep process
        BNE     spinin          ; and wait again
        STR     R1, [R0]        ; otherwise grab the lock


spinout SWP     R3, R2, [R0]    ; look at the lock, and lock others out
        CMN     R3, #1          ; anyone else trying to look?
                                ; conditional OS call to sleep process
        BEQ     spinout         ; yes, so wait our turn
        CMP     R3, R1          ; check we own it
        BNE     CorruptSemaphore ; we should have been the owner!
        MOV     R2, #0          ; load the 'free' value
        STR     R2, [R0]        ; and open the lock
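
The locking protocol can be followed more easily in a single-threaded model, where each call represents one pass through the entry or exit sequence. The Python sketch below is an illustration only: the class and function names are assumptions, and the `swp` method merely imitates the atomic exchange performed by the SWP instruction.

```python
FREE = 0       # an ID of zero indicates the lock is free
LOOKING = -1   # an ID of -1 indicates the lock is being inspected


class Semaphore:
    """A one-word lock holding FREE, LOOKING or the owner's process ID."""

    def __init__(self):
        self.value = FREE

    def swp(self, new_value):
        """Exchange the stored word with new_value and return the old word."""
        old, self.value = self.value, new_value
        return old


def try_enter(sem, process_id):
    """One pass of the entry sequence; True if the lock was obtained."""
    old = sem.swp(LOOKING)          # look at the lock, and lock others out
    if old == LOOKING:
        return False                # someone else is looking, try again later
    if old != FREE:
        sem.value = old             # restore the previous owner
        return False
    sem.value = process_id          # grab the lock
    return True


def try_leave(sem, process_id):
    """One pass of the exit sequence; True once the lock has been opened."""
    old = sem.swp(LOOKING)
    if old == LOOKING:
        return False                # someone else is looking, try again later
    if old != process_id:
        raise RuntimeError("corrupt semaphore: caller is not the owner")
    sem.value = FREE                # open the lock
    return True
```

A caller would repeat `try_enter` until it returns True, which corresponds to the busy-wait loop in the assembler version.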
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4.6 Other Code Examples 


The following sequences illustrate some other applications of ARM assembly language. 


4.6.1. Software Interrupt dispatch 


This code segment dispatches software interrupts (SWIs) to individual handlers. 
The SWI instruction has a 24-bit field that can be used to encode specific SWI functions. 


        STMFD   SP!, {R12}              ; save some registers
        LDR     R12, [R14, #-4]         ; load the SWI instruction
        BIC     R12, R12, #0xff000000   ; clear the top 8 bits, leaving the SWI number
        CMP     R12, #MaximumSWI        ; check the SWI number is in range, if so
        LDRLE   PC, [PC, R12, LSL #2]   ; branch through a table to the handler
        B       UnknownSWI              ; this SWI number is not supported
        DCD     SWI0Handler             ; address of handler for SWI 0
        DCD     SWI1Handler             ; address of handler for SWI 1
        DCD     SWI2Handler             ; address of handler for SWI 2





4.6.2 Single-channel DMA transfer 


The following code is an interrupt handler to perform interrupt-driven IO to memory
transfers (soft DMA). The code is especially useful as a FIQ handler, as it uses the
banked FIQ registers to maintain state between interrupts. Therefore this code is best
situated at location 0x1c.


R8      points to the base address of the IO device that data is read from
IOData  is the offset from the base address to the 32-bit data register that is
        read. Reading this register disables the interrupt
R9      points to the memory location where data is being transferred
R10     points to the last address to transfer to


The entire sequence to handle a normal transfer is just 4 instructions; code situated after
the conditional return is used to signal that the transfer is complete.


        LDR     r11, [r8, #IOData]      ; load port data from the IO device
        STR     r11, [r9], #4           ; store it to memory: update the pointer
        CMP     r9, r10                 ; reached the end?
        SUBLTS  pc, lr, #4              ; no, so return
        ; Insert transfer complete code here

Of course, byte transfers can be made by replacing the load instructions with load byte
instructions, and transfers from memory to an IO device are made by swapping the
addressing modes between the load instruction and the store instruction.
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4.6.3. Dual-channel DMA transfer 


This code is similar to the example in 4.6.2 Single-channel DMA transfer on page 4-10,
except that it handles two channels (which may be the input and output side of the same
channel). Again this code is especially useful as a FIQ handler, as it uses the banked
FIQ registers to maintain state between interrupts. Therefore, this code is best situated
at location 0x1c.


The entire sequence to handle a normal transfer is just 9 instructions; code situated after
the conditional return is used to signal that the transfer is complete.


        LDR     r13, [r8, #IOStat]      ; load status register to find ....
        TST     r13, #IOPort1Active     ; .... which port caused the interrupt?
        LDREQ   r13, [r8, #IOPort1]     ; load port 1 data
        LDRNE   r13, [r8, #IOPort2]     ; load port 2 data
        STREQ   r13, [r9], #4           ; store to buffer 1
        STRNE   r13, [r10], #4          ; store to buffer 2
        CMP     r9, r11                 ; reached the end?
        CMPNE   r10, r12                ; on either channel?
        SUBNES  pc, lr, #4              ; return
        ; Insert transfer complete code here


where:

R8              points to the base address of the IO device that data is read
                from
IOStat          is the offset from the base address to a register indicating
                which of two ports caused the interrupt
IOPort1Active   is a bit mask indicating if the first port caused the interrupt
                (otherwise it is assumed that the second port caused the
                interrupt)
IOPort1, IOPort2 are offsets to the two data registers to be read.
                Reading a data register disables the interrupt for that port
R9              points to the memory location that data from the first port is
                being transferred to
R10             points to the memory location that data from the second port
                is being transferred to
R11 and R12     point to the last address to transfer to
                (R11 for the first port, R12 for the second)


Again, byte transfers can be made by replacing the load instructions with load byte
instructions, and transfers from memory to an IO device are made by swapping the
addressing modes between the conditional load instructions and the conditional store
instructions.
4.6.4 Interrupt prioritisation


This code dispatches up to 32 interrupt sources to their appropriate handler routines.
This code is intended to use the normal interrupt vector, and so should be branched to
from location 0x18.

External hardware is used to prioritise the interrupt and present the number of
the highest-priority active interrupt in an IO register.

IntBase     holds the base address of the interrupt controller
IntLevel    holds the offset (from IntBase) of the register containing the
            highest-priority active interrupt
R13         is assumed to point to a small (60 byte) full descending stack

Interrupts are enabled after 10 instructions (including the branch to this code).


        ; first save the critical state
        SUB     r14, r14, #4            ; adjust return address before saving it
        STMFD   r13!, {r12, r14}        ; stack return address and working register
        MRS     r12, SPSR               ; get the SPSR
        STMFD   r13!, {r12}             ; and stack that too
        ; now get the priority level of the highest priority active interrupt
        MOV     r12, #IntBase           ; get interrupt controller's base address
        LDR     r12, [r12, #IntLevel]   ; get the interrupt level (0 to 31)
        ; now read-modify-write the CPSR to enable interrupts
        MRS     r14, CPSR               ; read the status register
        BIC     r14, r14, #0x40         ; clear the F bit (use 0x80 for the I bit)
        MSR     CPSR, r14               ; write it back to re-enable interrupts
        ; jump to the correct handler
        LDR     PC, [PC, r12, LSL #2]   ; and jump to the correct handler
                                        ; PC base address points to this
                                        ; instruction + 8
        NOP                             ; pad so the PC indexes this table
        ; table of handler start addresses
        DCD     Priority0Handler
        DCD     Priority1Handler
        ........
Priority0Handler
        STMFD   r13!, {r0 - r11}        ; save working registers
        ; insert handler code here
        LDMFD   r13!, {r0 - r12}        ; restore the working registers and the
                                        ; SPSR
        MRS     r14, CPSR               ; read the status register
        ORR     r14, r14, #0x40         ; set the F bit (use 0x80 for the I bit)
        MSR     CPSR, r14               ; write it back to disable interrupts
        MSR     SPSR, r12               ; stick the SPSR back
        LDMFD   r13!, {r12, PC}^        ; restore last working register and return
Priority1Handler
        ........
4.6.5 Context switch


This code performs a context switch on the user mode process. The code is based
around a list of pointers to process control blocks (PCBs) of processes that are ready to
run.

The pointer to the PCB of the next process to run is pointed to by R12, and the end
of the list has a zero pointer.

R13 is a pointer to the PCB, and is preserved between timeslices (so that on entry
R13 points to the PCB of the currently running process).

The code assumes the layout of the PCBs, as shown in Figure 4-1: PCB layout.


[Figure 4-1: PCB layout (diagram not reproduced; it marks the position of the PCB pointer)]


        STMIA   r13, {r0 - r14}^        ; dump user registers above r13
        MRS     r0, SPSR                ; pick up the user status
        STMDB   r13, {r0, r14}          ; and dump with return address below
        LDR     r13, [r12], #4          ; load next process info pointer
        CMP     r13, #0                 ; if it is zero, it is invalid
        LDMNEDB r13, {r0, r14}          ; pick up status and return address
        MSRNE   SPSR, r0                ; restore the status
        LDMNEIA r13, {r0 - r14}^        ; get the rest of the registers
        MOVNES  pc, r14                 ; and return and restore CPSR
        ; insert "no next process code" here
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This chapter describes the differences between 32-bit and 26-bit architectures. 


5.1 Introduction
5.2 Format of Register 15
5.3 Writing just the PSR in 26-bit architectures
5.4 26-bit PSR Update Instructions
5.5 Address Exceptions
5.6 Backwards Compatibility from 32-bit Architectures
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5.1 


Introduction 


ARM architecture versions 1, 2 and 2a are earlier versions of the ARM architecture 
which implement only a 26-bit address bus, and are known as 26-bit architectures. 
ARM architecture versions 3 and 4 implement a 32-bit address space (and are known 
as 32-bit architectures). For backwards compatibility, they also implement the 26-bit 
address space (except Version 3G). Implementation of a backwards-compatible 26-bit 
address space on ARM architecture version 4 is optional. 


There are several differences between the 26-bit and the 32-bit architectures:

Program counter     The 26-bit architectures implement only a 24-bit program
                    counter in register 15, which allows 64Mbytes of program
                    space. The 32-bit architectures have a 30-bit program
                    counter in register 15, which allows 4Gbytes of program
                    space.

Processor modes     Only four processor modes are supported on 26-bit
                    architectures:

                        User        (0b00)
                        FIQ         (0b01)
                        IRQ         (0b10)
                        Supervisor  (0b11)

Register 15         In the 26-bit architectures, the following are also stored in
                    register 15:

                        Four condition flags (N, Z, C and V)
                        The interrupt disable flags (I and F)
                        Two processor mode bits (M1 and M0)

CPSR/SPSR           The 26-bit architectures do not have a CPSR or any SPSRs.

Exceptions          An exception (called an address exception) is raised if a
                    memory access instruction uses an address that is greater
                    than 2^26 - 1 bytes.

Together, these differences make up the fundamental distinction between 26-bit and
32-bit architectures:

26-bit architectures    All process status (namely the condition flags, interrupt
                        status and processor mode) can be preserved across
                        subroutine calls and nested exceptions without adding any
                        instructions to the entry or exit sequence.

32-bit architectures    These give up this functionality to allow 32-bit instruction
                        addresses to be used.
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5.2 Format of Register 15 


 31   30   29   28   27   26 | 25                  2 |  1    0
  N    Z    C    V    I    F |    Program Counter    | M1   M0


Bits[25:2] are collectively known as the Program Counter. Because the Program
Counter occupies only 24 bits of register 15, only 2^24 instructions (2^26 bytes) can be
addressed, giving a maximum addressable program size of 64Mbytes.


Bits[31:26] and bits[1:0] are collectively known as the Program Status Register or PSR. 


The N, Z, C, V, I, and F bits have the same meaning in both 26-bit and 32-bit
architectures.


M[1:0] also have the same meaning in both architectures. 
Abort mode and Undef mode are not supported in 26-bit architectures. 


Aborts and undefined instruction exceptions have exactly the same actions in both 
modes, except that in 26-bit architectures, Supervisor mode is entered instead of Abort 
or Undef mode. 


The I, F and M[1:0] bits cannot be written directly when the processor is in User mode; 
in User mode they are only changed by an exception occurring. 
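The register 15 format described above can be sketched in Python as a small decoder. This is an illustrative model only: the function name `decode_r15_26bit`, the dictionary keys and the `MODE_NAMES` table are assumptions made for this example and are not part of the architecture.

```python
# Hypothetical helper: split a 26-bit-architecture R15 value into its fields.
MODE_NAMES = {0b00: "User", 0b01: "FIQ", 0b10: "IRQ", 0b11: "Supervisor"}

def decode_r15_26bit(r15):
    """Return the PSR flags, byte address and mode held in a 32-bit R15 value."""
    r15 &= 0xFFFFFFFF
    return {
        "N": (r15 >> 31) & 1,
        "Z": (r15 >> 30) & 1,
        "C": (r15 >> 29) & 1,
        "V": (r15 >> 28) & 1,
        "I": (r15 >> 27) & 1,
        "F": (r15 >> 26) & 1,
        # Bits[25:2] hold the Program Counter; as a byte address the two
        # low bits are zero, so the mask keeps bits[25:2] in place.
        "pc": r15 & 0x03FFFFFC,
        "mode": MODE_NAMES[r15 & 0b11],
    }

fields = decode_r15_26bit(0x8C000403)
print(fields["pc"], fields["mode"])
```

For the value 0x8C000403 the decoder reports N, I and F set, a program counter of 0x400 and Supervisor mode.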


5.2.1. Reading register 15 
In 26-bit architectures, the value of register 15 is read in five different ways. 


1 Most importantly, if register 15 has an unpredictable value in the 32-bit 
architecture, it also has an unpredictable value when used in the same way in 
the 26-bit architecture. 


2 If register 15 is specified in bits[19:16] of an instruction (and its value is not
unpredictable), only the Program Counter (bits[25:2]) is used; all other bits
read as zero.


3 If register 15 is specified in bits[3:0] of an instruction (and its value is not 
unpredictable), all 32 bits are used. 


4 If register 15 is stored using STR or STM, the value of the program counter 
(bits[25:0]) is IMPLEMENTATION DEFINED, but all 32 bits of register are stored. 


5 All 32 bits are stored in the Link register (R14) after a Branch with Link 
instruction or an exception entry. 
5.2.2 Writing register 15

In 26-bit architectures, the value of register 15 is written in three different ways:

1    Data-processing instructions without the S bit set, Load, and Load Multiple only
     write the PC part of register 15, and leave the PSR part unchanged.

2    Data-processing instructions with the S bit set and Load Multiple with restore
     PSR write the PC and the PSR part of register 15.

3    Variants of the CMP, CMN, TST and TEQ instructions write just the PSR part
     of register 15, and leave the PC part unchanged. These instruction variants are
     described below.

These read/write rules mean that register 15 is used in three basic ways. It is used as:

*    The Rn specifier in data-processing instructions, and as the base address for
     load and store instructions; only the value of the program counter is used,
     to simplify PC-relative addressing and position-independent code.

*    The Rm specifier in data-processing operands to allow all process status to be
     restored after a subroutine call or exception by subroutine-return instructions
     like MOVS PC, LR and LDM ..., {..., PC}^. These instructions are unpredictable in
     User mode on 32-bit architectures, but are legal on 26-bit architectures, as they
     are used to preserve the condition code values across procedure calls.

*    The value saved in the Link register to preserve the Program Counter and the
     PSR across subroutine calls and exceptions.

5.3 Writing just the PSR in 26-bit architectures


On 26-bit architectures, the MSR and MRS instructions are not supported. Instead, 
variants of the CMP, CMN, TST and TEQ instructions are used to write just the PSR part 
of register 15. These variants are called CMPP, CMNP, TSTP and TEQP, and are 
distinguished by having instruction bits[15:12] equal to 0b1111 (these bits are usually 
set to zero for these instructions). 


These instructions write their ALU result directly to the PSR part of register 15 (only N, 
Z, C and V are affected in User mode). 
5.4 26-bit PSR Update Instructions

Syntax
    TST{<cond>}P Rn, <shifter_operand>
    TEQ{<cond>}P Rn, <shifter_operand>
    CMP{<cond>}P Rn, <shifter_operand>
    CMN{<cond>}P Rn, <shifter_operand>

Description
    The instruction is only executed if the condition specified in the instruction matches the
    condition code status. The conditions are defined in 3.3 The Condition Field on page
    3-4.

    The TSTP, TEQP, CMPP and CMNP are 26-bit-only instructions and are used to write
    the PSR part of register 15 without affecting the PC part of register 15. When the
    processor is in User mode, only the condition codes will be affected; all other modes
    allow all PSR bits to be altered.

Encoding
    31  28 | 27 26 | 25 | 24 23 | 22 21 | 20 | 19  16 | 15   12 | 11              0
     cond  |  0  0 |  I |  1  0 |  opc  |  1 |   Rn   | 1 1 1 1 | shifter_operand

Operation
    if ConditionPassed(<cond>) then
        case <opc> of
            0b00    /* TSTP */
                <alu_out> = Rn AND <shifter_operand>
            0b01    /* TEQP */
                <alu_out> = Rn EOR <shifter_operand>
            0b10    /* CMPP */
                <alu_out> = Rn - <shifter_operand>
            0b11    /* CMNP */
                <alu_out> = Rn + <shifter_operand>
        endcase
        if R15[1:0] == 0b00 then            /* M[1:0] == 0b00, User mode */
            R15[31:28] = <alu_out>[31:28]   /* update just NZCV */
        else                                /* a privileged mode */
            R15[31:26] = <alu_out>[31:26]   /* update NZCVIF and... */
            R15[1:0] = <alu_out>[1:0]       /* ... update M[1:0] */

Exceptions
    None

Qualifiers
    Condition Code

Notes
    The I bit: Bit 25 is used to distinguish the immediate and register forms of
    <shifter_operand>. See the data processing instructions for the types of shifter
    operand.
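The write-back step of the operation above (which PSR bits of register 15 receive the ALU result) can be sketched in Python. This is a minimal model under stated assumptions: the function name `psr_update` and the two mask constants are invented for illustration, and the ALU result is taken as an already computed 32-bit value.

```python
# Hypothetical model of the TSTP/TEQP/CMPP/CMNP write-back to register 15.
USER_MASK = 0xF0000000        # N, Z, C, V only (bits[31:28])
PRIVILEGED_MASK = 0xFC000003  # N, Z, C, V, I, F (bits[31:26]) and M[1:0]

def psr_update(r15, alu_out):
    """Return the new R15 after writing alu_out to the PSR part only."""
    r15 &= 0xFFFFFFFF
    alu_out &= 0xFFFFFFFF
    # M[1:0] == 0b00 means User mode, where only the condition flags change.
    mask = USER_MASK if (r15 & 0b11) == 0b00 else PRIVILEGED_MASK
    # The PC part (bits[25:2]) is never covered by either mask.
    return (r15 & ~mask & 0xFFFFFFFF) | (alu_out & mask)

print(hex(psr_update(0x00008003, 0x0C000000)))
```

In the usage line, a Supervisor mode value is rewritten with I and F set and M[1:0] cleared, while the program counter bits are left as they were.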
5.5 Address Exceptions


On 26-bit architectures, all data addresses are checked to ensure that they are between 
0 and 64 Megabytes (26-bit). If a data address is produced with a 1 in any of the top 6 
bits, an address exception is generated. When an address exception is generated, 
the following actions are performed: 


R14_svc = address of address exception generating instruction + 8
CPSR[4:0] = 0b00011 ; Supervisor mode
CPSR[7] = 1 ; (Normal) Interrupts disabled
PC = 0x14


The address of the instruction which caused the address exception is the value in 
register 14 minus 8. 


Returning from an address exception 
As this exception implies a programming error, it is not usual to return from address 
exceptions, but if a return is required, use: 

SUBS PC,R14, #8 


This restores both the PC and PSR (from R14_svc) and returns to the instruction that 
generated the address exception. 
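The address check described in this section amounts to testing the top six bits of a 32-bit data address. A minimal Python sketch follows; the function name `raises_address_exception` is an assumption made for this example.

```python
# Hypothetical check: does a data address fall outside the 26-bit (64 Mbyte) range?
def raises_address_exception(address):
    """True if any of the top 6 bits of the 32-bit address is 1."""
    return ((address & 0xFFFFFFFF) >> 26) != 0

print(raises_address_exception(0x03FFFFFF), raises_address_exception(0x04000000))
```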
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5.6 Backwards Compatibility from 32-bit Architectures 


As well as the six (seven in Version 4) 32-bit processor modes, ARM Architecture
Version 3 (but not 3G), and 4 (optionally on 4T) implement the four 26-bit processor 
modes described above, including the register 15 format shown. This allows backwards 
compatibility for the older 26-bit programs by executing those programs in a 26-bit 
mode. If the backwards-compatibility support is not implemented, CPSR bit 4 (M[4]) 
always reads as 1, and all writes are ignored. 


The complete list of processor modes is shown in Table 5-1: 32-bit and 26-bit modes. 








M[4:0]  | Mode      | Accessible Registers
0b00000 | User_26   | R0 to R14, PC, (CPSR)
0b00001 | FIQ_26    | R0 to R7, R8_fiq to R14_fiq, PC, (CPSR, SPSR_fiq)
0b00010 | IRQ_26    | R0 to R12, R13_irq, R14_irq, PC, (CPSR, SPSR_irq)
0b00011 | SVC_26    | R0 to R12, R13_svc, R14_svc, PC, (CPSR, SPSR_svc)
0b10000 | User_32   | R0 to R14, PC, CPSR
0b10001 | FIQ_32    | R0 to R7, R8_fiq to R14_fiq, PC, CPSR, SPSR_fiq
0b10010 | IRQ_32    | R0 to R12, R13_irq, R14_irq, PC, CPSR, SPSR_irq
0b10011 | SVC_32    | R0 to R12, R13_svc, R14_svc, PC, CPSR, SPSR_svc
0b10111 | Abort_32  | R0 to R12, R13_abt, R14_abt, PC, CPSR, SPSR_abt
0b11011 | Undef_32  | R0 to R12, R13_und, R14_und, PC, CPSR, SPSR_und
0b11111 | System_32 | R0 to R14, PC, CPSR (Architecture Version 4 only)

Table 5-1: 32-bit and 26-bit modes
5.6.1 32-bit and 26-bit configuration

ARM Architecture Version 3, 3M and 4 (but not 3G or 4T) optionally incorporate two
signals that control 32-bit instruction accesses and 32-bit data accesses. The signals are
mapped to two bits in register 1 of the system control coprocessor. These signals are:

*    PROG32
*    DATA32

32-bit configuration

1    If PROG32 is active, the processor switches to a 32-bit mode when processing
     exceptions (including Reset), using the _32 modes for handling all exceptions.
     This is called a 32-bit configuration. Abort_32 mode is used for handling
     memory aborts, and Undef_32 for handling undefined instruction exceptions.
     A 26-bit mode can be selected by putting a 26-bit mode number into the M[4:0]
     bits of the CPSR (either using MSR or an exception return sequence). A 32-bit
     mode can also be entered from a 26-bit mode using the MSR instruction. Once
     in a 26-bit mode, another 26-bit mode can be entered using one of the TEQP,
     TSTP, CMPP and CMNP instructions, or the MSR instruction.

     If an exception occurs when the processor is in a 26-bit mode, only the PC bits
     from R15[25:2] are copied to the link register; the remaining bits in the link
     register are zeroed. The PSR bits from R15[31:26] and R15[1:0] are copied
     into the SPSR, ready for a normal 32-bit return sequence.

2    If PROG32 is active, and DATA32 is not active (32-bit programs with 26-bit
     data), the result is unpredictable.
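The split of register 15 between the link register and the SPSR on exception entry from a 26-bit mode can be sketched in Python. This is only an illustration of the bit selection: the function name is hypothetical, and the placement of the copied bits within the SPSR (N, Z, C, V in bits[31:28], I in bit 7, F in bit 6, M[1:0] in bits[1:0] with M[4:2] zero) is an assumption based on the 32-bit CPSR layout rather than something stated in this section. Any return-address adjustment applied by the exception is not modelled.

```python
# Hypothetical model: separate a 26-bit-mode R15 into link register and SPSR values.
def exception_entry_from_26bit(r15):
    """Return (link_register, spsr) built from a combined PC+PSR R15 value."""
    r15 &= 0xFFFFFFFF
    link_register = r15 & 0x03FFFFFC      # PC bits[25:2]; all other bits zeroed
    nzcv = r15 & 0xF0000000               # condition flags stay in bits[31:28]
    i_bit = (r15 >> 27) & 1
    f_bit = (r15 >> 26) & 1
    mode = r15 & 0b11                     # M[1:0]; M[4] == 0 marks a 26-bit mode
    # Assumed SPSR layout: I in bit 7, F in bit 6, mode in the low bits.
    spsr = nzcv | (i_bit << 7) | (f_bit << 6) | mode
    return link_register, spsr

lr, spsr = exception_entry_from_26bit(0x8C000403)
print(hex(lr), hex(spsr))
```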


26-bit configuration

1    If PROG32 is not active, the processor is locked into 26-bit modes (cannot be
     placed into a 32-bit mode by any means) and handles exceptions in 26-bit
     modes. This is called a 26-bit configuration. In this configuration, TEQP, TSTP,
     CMPP and CMNP instructions, or the MSR instruction, can be used to switch
     between 26-bit modes. Attempts to write CPSR bits[4:2] (M[4:2]) are ignored, stopping
     any attempts to switch to a 32-bit mode, and SVC_26 mode is used to handle
     memory aborts and undefined instruction exceptions. The program counter is
     limited to 24 bits, limiting the addressable program memory to 64 Megabytes.

2    If PROG32 is not active, DATA32 has the following actions:

     a)   If DATA32 is not active, all data addresses are checked to ensure
          that they are between 0 and 64 Megabytes (26-bit). If a data address
          is produced with a 1 in any of the top 6 bits, an address exception is
          generated.

     b)   If DATA32 is active, full 32-bit addresses may be produced and are
          not checked for address exceptions. This allows 26-bit programs to
          access 32-bit data.
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When the processor is in a 32-bit configuration (PROG32 is active) and in a 26-bit mode 
(CPSR[4] == 0), data accesses (but not instruction fetches) to the hard vectors
(addresses 0x0 to 0x1f) cause a data abort, known as a vector exception.


Vector exceptions are always produced if the hard vectors are written in a 32-bit 
configuration and a 26-bit mode, and it is IMPLEMENTATION DEFINED whether reading 
the hard vectors in a 32-bit configuration and a 26-bit mode also causes a vector 
exception. 


Vector exceptions are provided to support 26-bit backwards compatibility. 

When a vector exception is generated, it indicates that a 26-bit mode process is trying 
to install a (26-bit) vector handler. Because the processor is in a 32-bit configuration, 
exceptions will be handled in a 32-bit mode, so a veneer must be used to change from 
the 32-bit exception mode to a 26-bit mode before calling the 26-bit exception handler. 
This veneer may be installed on each vector, and can switch to a 26-bit mode before 
calling any 26-bit handlers. 


The 26-bit exception handler’s return may also need to be veneered. Some SWI 
handlers return status information in the processor flags, and this information will need 
to be transferred from the link register to the SPSR with a return veneer for the SWI 
handler. 
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The Thumb instruction set is a subset of the ARM instruction set. 


The Thumb Instruction Set 


Thumb is designed to increase the performance of ARM implementations that use
a 16-bit or narrower memory data bus, and to allow better code density than ARM. The ARMv4T
architecture incorporates both a full 32-bit ARM instruction set and the 16-bit Thumb
instruction set. Every Thumb instruction can be encoded in 16 bits.

This chapter lists every Thumb instruction, and gives information on its format and
encoding.
6.1 Using this Chapter 


This chapter is divided into three parts: 

1    Introduction to Thumb
2    Overview of the Thumb instruction types
3    Alphabetical list of instructions


6.1.1 Introduction to Thumb (page 6-3 through 6-4) 


This part describes the Thumb concepts and how it fits in with ARM instruction 
execution. 


6.1.2 Overview of the Thumb instruction types (page 6-5 through 6-15) 


This part describes the functional groups within the Thumb instruction set, and shows 
relevant examples and encodings. Each functional group lists all its instructions, which 
you can then find in the alphabetical section. The functional groups are: 


1    Branch Instructions
2    Data-processing Instructions
3    Load and Store Register Instructions
4    Load and Store Multiple Instructions


6.1.3 Alphabetical list of instructions (page 6-19 through 6-82) 


This part lists every Thumb instruction in alphabetical order, and gives: 
*    instruction syntax and functional group
*    encoding and operation
*    relevant exceptions and qualifiers
*    notes on usage


Where relevant, the instruction descriptions show the equivalent ARM instruction and 
encoding. 
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6.2 Introduction to Thumb 


Thumb does not alter the underlying structure of the ARM architecture; it merely 
presents restricted access to the ARM architecture. All Thumb data-processing 
instructions operate on full 32-bit values, and full 32-bit addresses are produced by
both data-access instructions and instruction fetches.


When the processor is executing Thumb, eight general-purpose integer registers are
available, R0 to R7, which are the same physical registers as R0 to R7 when executing
ARM. Some Thumb instructions also access the Program Counter (ARM Register 15),
the Link Register (ARM Register 14) and the Stack Pointer (ARM Register 13). Further
instructions allow limited access to ARM registers 8 to 15 (known as the high registers).

When R15 is read, bit[0] is zero and bits[31:1] contain the PC. When R15 is written,
bit[0] is IGNORED and bits[31:1] are written to the PC. Depending on how it is used,
the value of the PC is either the address of the instruction plus 4 or is UNPREDICTABLE.


Thumb does not provide direct access to the CPSR or any SPSR (as in the ARM MSR 
and MRS instructions). Thumb execution is flagged by the T bit (bit 5) in the CPSR: 


T == 0    32-bit instructions are fetched (and the PC is incremented by 4) and
          are executed as ARM instructions
T == 1    16-bit instructions are fetched from memory (and the PC is
          incremented by 2) and are executed as Thumb instructions


6.2.1 Entering Thumb state 


Thumb execution is normally entered by executing an ARM BX instruction (Branch and
eXchange instruction set). This instruction branches to the address held in a
general-purpose register, and if bit[0] of that register is 1, Thumb execution begins at
the branch target address. If bit[0] of the target register is 0, ARM execution continues
from the branch target address.
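The state selection performed by BX can be sketched in Python. This is a simplified model: the function name `branch_exchange` is an assumption made for this example, and clearing bit[0] to form the branch address is the only address adjustment modelled.

```python
# Hypothetical model of BX: bit[0] of the register selects the instruction set.
def branch_exchange(rm):
    """Return (target_address, thumb_state) for a 32-bit register value."""
    rm &= 0xFFFFFFFF
    thumb_state = bool(rm & 1)        # 1 selects Thumb, 0 selects ARM
    target_address = rm & 0xFFFFFFFE  # bit[0] is a state flag, not an address bit
    return target_address, thumb_state

print(branch_exchange(0x00008001))
```

A register value of 0x8001 therefore branches to 0x8000 and begins Thumb execution, while 0x8000 branches to the same address and continues in ARM state.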


Thumb execution can also be initiated by setting the T bit in the SPSR and executing 
an ARM instruction, which restores the CPSR from the SPSR (a data-processing 
instruction with the S bit set and the PC as the destination, or a Load Multiple and 
Restore CPSR instruction). This allows an operating system to automatically restart 
a process independently of whether that process is executing Thumb code or ARM 
code. 


The result is UNPREDICTABLE if the T bit is altered directly by writing the CPSR. 


6.2.2 Exceptions 


Exceptions generated during Thumb execution switch to ARM execution before 
executing the exception handler (whose first instruction is at the hardware vector). 
The state of the T bit is preserved in the SPSR, and the LR of the exception mode is 
set so that the same return instruction performs correctly, regardless of whether 

the exception occurred during ARM or Thumb execution. 


Table 6-1: Exception return instructions lists the values of the exception mode LR for 
exceptions generated during Thumb execution. 
Exception      | Exception Link Register Value                           | Return Instruction
Reset          | Unpredictable value                                     | MOVS PC, R14
Undefined      | Address of Undefined instruction + 2                    | MOVS PC, R14
SWI            | Address of SWI instruction + 2                          | MOVS PC, R14
Prefetch Abort | Address of aborted instruction fetch + 4                | SUBS PC, R14, #4
Data Abort     | Address of the instruction that generated the abort + 8 | SUBS PC, R14, #8
IRQ            | Address of the next instruction to be executed + 4      | SUBS PC, R14, #4
FIQ            | Address of the next instruction to be executed + 4      | SUBS PC, R14, #4

Table 6-1: Exception return instructions

6.3 Instruction Set Overview 


    Shift by immediate
    Add/Subtract register
    Add/Subtract immediate
    Add/Subtract/Compare immediate
    Data-processing register
    Special data processing
    Load from literal pool
    Load/Store Word/Byte Register
    Load/Store Signed Byte/Halfword Register
    Load/Store Word/Byte Immediate
    Load/Store Halfword immediate
    Load/Store to/from stack
    Add/Subtract to/from SP or PC
    Adjust stack pointer
    Push/Pop register list
    Load/Store Multiple
    Conditional branch
    Software interrupt
    Unconditional branch
    Undefined instruction
    BL prefix
    BL

[Figure: bit-field encodings of the instruction formats listed above]

Figure 6-1: The Thumb instruction set (expanded)
6.4 Branch Instructions 


Thumb supports four types of branch instruction. 


*    an unconditional branch that allows a forward or backward branch of up to
     2 Kbytes

*    a conditional branch that allows forward and backward branches of up to
     256 bytes

*    a branch and link (subroutine call), supported with a pair of instructions that
     allow forward and backward branches of up to 4 Mbytes

*    a branch and exchange instruction that branches to an address in a register and
     optionally switches to ARM code execution.


6.4.1 Encoding 


The encoding for these formats is given below: 


Format 1
    B<cond> <target_address>

    15 14 13 12 | 11   8 | 7                   0
     1  1  0  1 |  cond  | 8_bit_signed_offset

Format 2
    B <target_address>

    15 14 13 12 11 | 10                   0
     1  1  1  0  0 | 11_bit_signed_offset

Format 3
    BL <target_address>

    15 14 13 12 | 11 | 10                   0
     1  1  1  1 |  H | 11_bit_signed_offset

Format 4
    BX Rm

    15 14 13 12 11 10  9  8 |  7 |  6 | 5  3 | 2  0
     0  1  0  0  0  1  1  1 |  0 | H2 |  Rm  | SBZ
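The target of a Format 1 (conditional) branch is formed by sign-extending the 8-bit offset, shifting it left by one bit, and adding it to the PC, which holds the address of the branch instruction plus 4. The Python sketch below works through that calculation; the function names `sign_extend` and `cond_branch_target` are assumptions made for this example, and the condition test itself is not modelled.

```python
# Hypothetical helpers for the Format 1 branch offset calculation.
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def cond_branch_target(instruction_address, instruction):
    """Return the target of B<cond> if the branch is taken."""
    offset8 = instruction & 0xFF               # bits[7:0] of the encoding
    pc = instruction_address + 4               # PC value seen by the instruction
    return (pc + (sign_extend(offset8, 8) << 1)) & 0xFFFFFFFF

print(hex(cond_branch_target(0x1000, 0xD0FE)))
```

With an offset field of 0xFE (that is, -2) the branch targets its own address, which is a quick sanity check on the plus 4 adjustment.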
6.4.2 Examples

        B       label       ; unconditionally branch to label
        BCC     label       ; branch to label if carry flag is clear
        BEQ     label       ; branch to label if zero flag is set

        BL      func        ; subroutine call to function

func
        MOV     PC, LR      ; R15=R14, return to instruction after the BL

        BX      R12         ; branch to address in R12; begin ARM execution if
                            ; bit 0 of R12 is zero; otherwise continue executing
                            ; Thumb code

6.4.3 List of branch instructions

The following instructions follow the formats shown above.

B       Conditional branch                      page 6-30
B       Unconditional branch                    page 6-31
BL      Branch with Link                        page 6-33
BX      Branch and exchange instruction set     page 6-34
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6.5 Data-processing Instructions 




Thumb data-processing instructions are a subset of the ARM data-processing 
instructions, as shown below. 


All Thumb data-processing instructions set the condition codes. 


Mnemonic                 | Operation              | Action
MOV Rd, #0 to 255        | Move                   | Rd := 8-bit immediate
MVN Rd, Rm               | Move Not               | Rd := NOT Rm
ADD Rd, Rn, Rm           | Add                    | Rd := Rn + Rm
ADD Rd, Rn, #0 to 7      | Add                    | Rd := Rn + 3-bit immediate
ADD Rd, #0 to 255        | Add                    | Rd := Rd + 8-bit immediate
ADC Rd, Rm               | Add with Carry         | Rd := Rd + Rm + Carry Flag
SUB Rd, Rn, Rm           | Subtract               | Rd := Rn - Rm
SUB Rd, Rn, #0 to 7      | Subtract               | Rd := Rn - 3-bit immediate
SUB Rd, #0 to 255        | Subtract               | Rd := Rd - 8-bit immediate
SBC Rd, Rm               | Subtract with Carry    | Rd := Rd - Rm - NOT(Carry Flag)
NEG Rd, Rm               | Negate                 | Rd := 0 - Rm
AND Rd, Rm               | Logical AND            | Rd := Rd AND Rm
EOR Rd, Rm               | Logical Exclusive OR   | Rd := Rd EOR Rm
ORR Rd, Rm               | Logical (inclusive) OR | Rd := Rd OR Rm
BIC Rd, Rm               | Bit Clear              | Rd := Rd AND NOT Rm
CMP Rn, #0 to 255        | Compare                | update flags after Rn - 8-bit immediate
CMP Rn, Rm               | Compare                | update flags after Rn - Rm
CMN Rn, Rm               | Compare Negated        | update flags after Rn + Rm
TST Rn, Rm               | Test                   | update flags after Rn AND Rm
MUL Rd, Rs               | Multiply               | Rd := Rs x Rd
LSL Rd, Rm, #0 to 31     | Logical Shift Left     | Rd := Rm LSL 5-bit immediate
LSL Rd, Rs               | Logical Shift Left     | Rd := Rd LSL Rs
LSR Rd, Rm, #0 to 31     | Logical Shift Right    | Rd := Rm LSR 5-bit immediate
LSR Rd, Rs               | Logical Shift Right    | Rd := Rd LSR Rs
ASR Rd, Rm, #0 to 31     | Arithmetic Shift Right | Rd := Rm ASR 5-bit immediate
ASR Rd, Rs               | Arithmetic Shift Right | Rd := Rd ASR Rs
ROR Rd, Rs               | Rotate Right           | Rd := Rd ROR Rs

Table 6-2: Thumb data-processing instructions
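The Action column of Table 6-2 can be read as arithmetic on 32-bit values. The following Python sketch models four of the rows (ADC, SBC, NEG and BIC). The function names and the `MASK` constant are illustrative assumptions, and the condition flags that these instructions also update are not modelled.

```python
# Hypothetical 32-bit models of four Table 6-2 actions (result only, no flags).
MASK = 0xFFFFFFFF

def adc(rd, rm, carry):
    """Rd := Rd + Rm + Carry Flag"""
    return (rd + rm + carry) & MASK

def sbc(rd, rm, carry):
    """Rd := Rd - Rm - NOT(Carry Flag)"""
    return (rd - rm - (1 - carry)) & MASK

def neg(rm):
    """Rd := 0 - Rm"""
    return (0 - rm) & MASK

def bic(rd, rm):
    """Rd := Rd AND NOT Rm"""
    return rd & ~rm & MASK

print(hex(sbc(5, 3, 0)), hex(neg(1)), hex(bic(0xFF, 0x0F)))
```

Note that SBC subtracts one extra when the carry flag is clear, which is what allows multi-word subtraction to be chained after a SUB.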
Examples

        ADD     R0, R4, R7      ; R0 = R4 + R7
        SUB     R6, R1, R2      ; R6 = R1 - R2
        ADD     R0, #255        ; R0 = R0 + 255
        ADD     R1, R4, #4      ; R1 = R4 + 4
        NEG     R3, R1          ; R3 = 0 - R1
        AND     R2, R5          ; R2 = R2 AND R5
        EOR     R1, R6          ; R1 = R1 EOR R6
        CMP     R2, R3          ; update flags after R2 - R3
        CMP     R7, #100        ; update flags after R7 - 100
        MOV     R0, #200        ; R0 = 200

6.5.1 High registers

There are seven types of data-processing instruction which operate on ARM registers
8 to 14 and the PC (called the high registers).

Mnemonic                 | Operation               | Action
MOV Rd, Rn               | Move                    | Rd := Rn
ADD Rd, Rm               | Add                     | Rd := Rd + Rm
CMP Rn, Rm               | Compare                 | update flags after Rn - Rm
ADD SP, #0 to 508        | Increment stack pointer | R13 = R13 + #9-bit immediate
SUB SP, #0 to 508        | Decrement stack pointer | R13 = R13 - #9-bit immediate
ADD Rd, SP, #0 to 1020   | Form Stack address      | Rd = R13 + #10-bit immediate
ADD Rd, PC, #0 to 1020   | Form PC address         | Rd = PC + #10-bit immediate

Table 6-3: High register data-processing instructions

Examples

        MOV     R0, R12         ; R0 = R12
        ADD     R10, R1         ; R10 = R10 + R1
        MOV     PC, LR          ; PC = R14
        CMP     R10, R11        ; update flags after R10 - R11
        SUB     SP, #100        ; increase stack size by 100 bytes
        ADD     SP, #16         ; decrease stack size by 16 bytes
        ADD     R2, SP, #20     ; R2 = SP + 20
        ADD     R0, PC, #500    ; R0 = PC + 500
6.5.2 Formats

Data-processing instructions perform an operation on the general processor registers:

Format 1
    <opcode1> Rd, Rn, Rm
    <opcode1> := ADD | SUB

    15 14 13 12 11 10 |  9   | 8  6 | 5  3 | 2  0
     0  0  0  1  1  0 | op_1 |  Rm  |  Rn  |  Rd

Format 2
    <opcode2> Rd, Rn, #<3_bit_immed>
    <opcode2> := ADD | SUB

    15 14 13 12 11 10 |  9   | 8             6 | 5  3 | 2  0
     0  0  0  1  1  1 | op_2 | 3_bit_immediate |  Rn  |  Rd

Format 3
    <opcode3> Rd|Rn, #<8_bit_immed>
    <opcode3> := ADD | SUB | MOV | CMP

    15 14 13 | 12 11 | 10   8 | 7               0
     0  0  1 | op_3  | Rd|Rn  | 8_bit_immediate

Format 4
    <opcode4> Rd, Rm, #<shift_imm>
    <opcode4> := LSL | LSR | ASR

    15 14 13 | 12 11 | 10            6 | 5  3 | 2  0
     0  0  0 | op_4  | shift_immediate |  Rm  |  Rd

Format 5
    <opcode5> Rd|Rn, Rm|Rs
    <opcode5> := MVN | CMP | CMN | TST | ADC | SBC | NEG | MUL |
                 LSL | LSR | ASR | ROR | AND | EOR | ORR | BIC

    15 14 13 12 11 10 | 9    6 | 5     3 | 2     0
     0  1  0  0  0  0 |  op_5  |  Rm|Rs  |  Rd|Rn

Format 6
    ADD Rd, <reg>, #<8_bit_immed>
    <reg> := SP | PC

    15 14 13 12 | 11  | 10  8 | 7               0
     1  0  1  0 | reg |  Rd   | 8_bit_immediate

Format 7
    <opcode6> SP, SP, #<7_bit_immed>
    <opcode6> := ADD | SUB

    15 14 13 12 11 10  9  8 |  7   | 6               0
     1  0  1  1  0  0  0  0 | op_6 | 7_bit_immediate


6.5.3 List of data-processing instructions 


The following instructions follow the formats shown above. 














ADC     Add with Carry (register)               page 6-19
ADD     Add (immediate)                         page 6-20
ADD     Add (large immediate)                   page 6-21
ADD     Add (register)                          page 6-22
ADD     Add high registers                      page 6-23
ADD     Add (immediate to program counter)      page 6-24
ADD     Add (immediate to stack pointer)        page 6-25
ADD     Increment stack pointer                 page 6-26
AND     Logical AND                             page 6-27
ASR     Arithmetic shift right (immediate)      page 6-28
ASR     Arithmetic shift right (register)       page 6-29
BIC     Bit clear                               page 6-32
CMN     Compare negative (register)             page 6-35
CMP     Compare (immediate)                     page 6-36
CMP     Compare (register)                      page 6-37
CMP     Compare high registers                  page 6-38
EOR     Exclusive OR                            page 6-39
LSL     Logical shift left (immediate)          page 6-52
LSL     Logical shift left (register)           page 6-53
LSR     Logical shift right (immediate)         page 6-54
LSR     Logical shift right (register)          page 6-55
MOV     Move (immediate)                        page 6-56
MOV     Move high registers                     page 6-57
MUL     Multiply                                page 6-58
MVN     Move NOT (register)                     page 6-59
NEG     Negate (register)                       page 6-60
ORR     Logical OR                              page 6-61
ROR     Rotate right (register)                 page 6-66
SBC     Subtract with Carry (register)          page 6-67
SUB     Subtract (immediate)                    page 6-77
SUB     Subtract (large immediate)              page 6-78
SUB     Subtract (register)                     page 6-79
SUB     Decrement stack pointer                 page 6-80
TST     Test (register)                         page 6-82


6.6 Load and Store Register Instructions

Thumb supports 8 types of load and store register instructions. Two basic addressing
modes are available:

*    register plus register
*    register plus 5-bit immediate

If an immediate offset is used, it is scaled by 4 for word accesses and 2 for halfword
accesses. Three special instructions allow a load using the PC as a base with a 1 Kbyte
(word-aligned) immediate offset, and a load and a store with the stack pointer
(R13) as the base and a 1 Kbyte (word-aligned) immediate offset.

6.6.1 Formats

Load and Store instructions perform an operation on the general processor registers.
Load and store instructions have the following formats:





Format 1
    <opcode1> Rd, [Rn, #<5_bit_offset>]
    <opcode1> := LDR | LDRH | LDRB | STR | STRH | STRB

    15     11 | 10         6 | 5  3 | 2  0
     opcode1  | 5_bit_offset |  Rn  |  Rd

Format 2
    <opcode2> Rd, [Rn, Rm]
    <opcode2> := LDR | LDRH | LDRSH | LDRB | LDRSB | STR | STRH | STRB

    15      9 | 8  6 | 5  3 | 2  0
     opcode2  |  Rm  |  Rn  |  Rd

Format 3
    LDR Rd, [PC, #<8_bit_offset>]

    15 14 13 12 11 | 10  8 | 7               0
     0  1  0  0  1 |  Rd   | 8_bit_immediate

Format 4
    <opcode3> Rd, [SP, #<8_bit_offset>]
    <opcode3> := LDR | STR

    15 14 13 12 | 11 | 10  8 | 7               0
     1  0  0  1 |  L |  Rd   | 8_bit_immediate
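The scaling of immediate offsets described for these formats can be sketched in Python. The table `ACCESS_SIZE` and both function names are assumptions made for this example; they simply multiply the encoded field by the access size in bytes.

```python
# Hypothetical helpers: convert encoded immediate fields to byte offsets.
ACCESS_SIZE = {"LDR": 4, "STR": 4, "LDRH": 2, "STRH": 2, "LDRB": 1, "STRB": 1}

def immediate_offset(mnemonic, imm5):
    """Byte offset for a Format 1 load/store with a 5-bit immediate field."""
    if not 0 <= imm5 <= 31:
        raise ValueError("5-bit immediate must be in the range 0 to 31")
    return imm5 * ACCESS_SIZE[mnemonic]

def pc_or_sp_offset(imm8):
    """Byte offset for the PC-relative and SP-relative forms (word scaled)."""
    if not 0 <= imm8 <= 255:
        raise ValueError("8-bit immediate must be in the range 0 to 255")
    return imm8 * 4

print(immediate_offset("STR", 31), pc_or_sp_offset(255))
```

The largest word offset with a 5-bit field is therefore 124 bytes, and the largest PC-relative or SP-relative offset is 1020 bytes, which is the "1 Kbyte (word-aligned)" range mentioned in the text.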
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6.6.2 Examples 




















        LDR     R4, [R2, #4]        ; Load word into R4 from address R2 + 4
        LDR     R4, [R2, R1]        ; Load word into R4 from address R2 + R1
        STR     R0, [R7, #0x7c]     ; Store word from R0 to address R7 + 124
        STRB    R1, [R5, #31]       ; Store byte from R1 to address R5 + 31
        STRH    R4, [R2, R3]        ; Store halfword from R4 to R2 + R3
        LDRSB   R5, [R0, #0]        ; Load signed byte into R5 from R0
        LDRSH   R1, [R2, #10]       ; Load signed halfword to R1 from R2 + 10
        LDRH    R3, [R6, R5]        ; Load halfword into R3 from R6 + R5
        LDRB    R2, [R1, #5]        ; Load byte into R2 from R1 + 5
        LDR     R6, [PC, #0x3fc]    ; Load R6 from PC + 0x3fc
        LDR     R5, [SP, #64]       ; Load R5 from SP + 64
        STR     R4, [SP, #0x260]    ; Store R4 to SP + 0x260























6.6.3 List of load and store register instructions 


The following instructions follow the formats shown above. 








LDR Load word (immediate offset) page 6-42 
LDR Load word (register offset) page 6-43 
LDR Load word (PC-relative) page 6-44 
LDR Load word (SP-relative) page 6-45 
LDRB Load unsigned byte (immediate offset) page 6-46 
LDRB Load unsigned byte (register offset) page 6-47 
LDRH Load unsigned halfword (immediate offset) page 6-48 
LDRH Load unsigned halfword (register offset) page 6-49 
LDRSB Load signed byte (register offset) page 6-50 
LDRSH Load signed halfword (register offset) page 6-51 
STR Store word (immediate offset) page 6-70 
STR Store word (register offset) page 6-71 
STR Store word (SP-relative) page 6-72 
STRB Store byte (immediate offset) page 6-73 
STRB Store byte (register offset) page 6-74 
STRH Store halfword (immediate offset) page 6-75 
STRH Store halfword (register offset) page 6-76 
6.7 Load and Store Multiple Instructions

Thumb supports four types of load and store multiple instructions.

Two (a load and a store) are designed to support block copy; they have a fixed
increment-after addressing mode from a base register.

The other two instructions (called PUSH and POP) also have a fixed addressing mode.
They implement a full descending stack, and the stack pointer (R13) is used as the
base register.

All four instructions can transfer any or all of the lower 8 registers. PUSH can also stack
the return address and POP can load the PC. All four instructions update the base
register after the transfer.

6.7.1 Formats

Format 1
<opcode1> Rn!, <register_list>
<opcode1> := LDMIA | STMIA
15 14 13 12 11 10 8 7 0
1 1 0 0 L Rn register_list

Format 2
POP {<register_list>,<PC>}
PUSH {<register_list>,<LR>}
15 14 13 12 11 10 9 8 7 0
1 0 1 1 L 1 0 R register_list

6.7.2 Examples

LDMIA R7!, {R0 - R3, R5} ; Load R0 to R3 and R5 from R7
                         ; then add 20 to R7
STMIA R0!, {R3, R4, R5}  ; Store R3-R5 to R0: add 12 to R0
function
    STMFD R13!, {R0-R7, LR} ; save all regs and return address
                            ; code of the function body
    LDMFD R13!, {R0-R7, PC} ; restore all registers and return
PUSH {R0 - R7, LR}       ; push onto the stack (R13) R0 - R7 and
                         ; the return address
POP {R0 - R7, PC}        ; restore R0 - R7 from the stack,
                         ; and the program counter, and return
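The full descending stack behaviour of PUSH and POP can be modelled in a few lines. The following Python sketch is illustrative only: the `Cpu` class, its dictionary-based memory, and the method names are assumptions made for this example and are not part of the architecture.

```python
class Cpu:
    """Toy model (assumed layout): 16 registers and a sparse word memory."""

    def __init__(self, sp):
        self.r = [0] * 16
        self.r[13] = sp          # R13 is the stack pointer
        self.mem = {}            # address -> 32-bit word

    def push(self, regs):
        """PUSH {regs}: lower SP first, lowest register at lowest address."""
        regs = sorted(regs)
        self.r[13] = (self.r[13] - 4 * len(regs)) & 0xFFFFFFFF
        addr = self.r[13]
        for n in regs:
            self.mem[addr] = self.r[n]
            addr += 4

    def pop(self, regs):
        """POP {regs}: load in ascending register order, then raise SP."""
        regs = sorted(regs)
        addr = self.r[13]
        for n in regs:
            self.r[n] = self.mem[addr]
            addr += 4
        self.r[13] = addr & 0xFFFFFFFF


cpu = Cpu(sp=0x8000)
cpu.r[0], cpu.r[1], cpu.r[14] = 0x11, 0x22, 0x1000   # R14 (LR) holds a return address
cpu.push([0, 1, 14])                                  # PUSH {R0, R1, LR}
cpu.r[0] = cpu.r[1] = 0                               # function body clobbers R0, R1
cpu.pop([0, 1, 15])                                   # POP {R0, R1, PC}
```

After the POP, R0 and R1 hold their saved values, the PC holds the stacked return address, and the stack pointer is back at its starting value.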
6.7.3 List of load and store multiple instructions


The following instructions follow the formats shown above. 


LDM Load multiple page 6-40 
POP Pop multiple page 6-62 
PUSH Push multiple page 6-64 
STM Store multiple page 6-68 
[Figure: annotated sample instruction page, using B<cond> <target_address> as the example. The labelled parts of each instruction page are:]

Instruction name: given in the following alphabetical list
Function: short description of the instruction
Architecture availability: Thumb instructions are available in Architecture v4T only
Encoding: specifies the bit patterns for the instruction
Operation: describes the operation of the instruction in pseudo-code
Exceptions: lists any possible exceptions
Qualifiers and flag settings: lists any conditions and flag settings that apply to the instruction
User notes: gives notes on using the instruction
Equivalent ARM instruction: gives the syntax and encoding for the equivalent ARM instruction
ADC

Add with carry

Architecture v4T only

ADC Rd, Rm

Description The ADC (Add with Carry) instruction is used to synthesize 64-bit addition.
If register pairs R0,R1 and R2,R3 hold 64-bit values (R0 and R2 hold
the least-significant word), the following instructions leave the 64-bit sum in
R0,R1:
    ADD R0, R2
    ADC R1, R3
The instruction ADC R0, R0 produces a single-bit Rotate Left with Extend
operation (33-bit rotate through the carry flag) on R0.
ADC adds the value of register Rd, and the value of the Carry flag, and the value
of register Rm, and stores the result in register Rd. The condition code flags are
updated (based on the result).
15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation Rd = Rd + Rm + C Flag 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 
C Flag = CarryFrom(Rd + Rm + C Flag) 
V Flag = OverflowFrom(Rd + Rm + C Flag) 
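The 64-bit addition and the rotate-through-carry idiom described above can be checked with a small model. This is a minimal Python sketch, assuming 32-bit registers held as plain integers; the helper names `add_with_flags` and `add64` are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b, carry_in=0):
    """Return (32-bit result, carry out) of a + b + carry_in."""
    total = a + b + carry_in
    return total & MASK32, 1 if total > MASK32 else 0

def add64(r0, r1, r2, r3):
    """Model ADD R0, R2 followed by ADC R1, R3 (R0 and R2 are the low words)."""
    r0, c = add_with_flags(r0, r2)        # ADD R0, R2 sets the C flag
    r1, c = add_with_flags(r1, r3, c)     # ADC R1, R3 consumes it
    return r0, r1, c

# 0x1_FFFFFFFF + 0x2_00000001, split into low and high words
lo, hi, carry = add64(0xFFFFFFFF, 0x00000001, 0x00000001, 0x00000002)

# ADC R0, R0 with C set: bit 31 moves into C, the old C enters bit 0
rotated, c_out = add_with_flags(0x80000001, 0x80000001, 1)
```

The carry out of the low-word addition is what makes the high-word result correct; without ADC the high word would be one too small in this example.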
Exceptions None 
Qualifiers None 
Equivalent ARM syntax and encoding 
ADCS Rd, Rd, Rm 
31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
ADD (1)


Add immediate 


Thumb 


Architecture v4T only 


ADD Rd, Rn, #<3_bit_immediate> 


Description This form of the ADD instruction adds a small constant value to the value of 
a register and stores the result in a second register. 


In this case, ADD adds the value of register Rn and the value of the 3-bit immediate 
(values 0 to 7), and stores the result in the destination register Rd. The condition 
code flags are updated (based on the result). 





Operation Rd = Rn + <3_bit_immed>
N Flag = Rd[31]
Z Flag = if Rd == 0 then 1 else 0
C Flag = CarryFrom(Rn + <3_bit_immed>)
V Flag = OverflowFrom(Rn + <3_bit_immed>)


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ADDS Rd, Rn, #<3_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
ADD (2)

Add large immediate

Architecture v4T only

ADD Rd, #<8_bit_immediate>

Description This form of the ADD instruction adds a large constant value to
the value of a register and stores the result back in the same register.

In this case, ADD adds the value of register Rd and the value of the 8-bit
immediate (values 0 to 255), and stores the result back in register Rd.
The condition code flags are updated (based on the result).





15 14 13 12 11 10 9 8 7 0 
Operation Rd = Rd + <8_bit_immed> 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 





C Flag = CarryFrom(Rd + <8_bit_immed>) 
V Flag = OverflowFrom(Rd + <8_bit_immed>) 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ADDS Rd, Rd, #<8_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
ADD (3)


Add register 


Architecture v4T only 


ADD Rd, Rn, Rm 


Description This form of the ADD instruction adds the value of one register to the value of 
a second register, and stores the result in a third register. 


In this case, ADD adds the value of register Rn and the value of register Rm, 
and stores the result in the destination register Rd. The condition code flags are 
updated (based on the result). 


15 14 13 12 11 10 9 8 6 5 3 2 0 
Operation Rd = Rn + Rm 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 





C Flag = CarryFrom(Rn + Rm) 
V Flag = OverflowFrom(Rn + Rm) 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ADDS Rd, Rn, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
ADD (4)

Add high registers

Architecture v4T only

ADD Rd, Rm

Description This form of the ADD instruction is used for addition of values in the high registers.
In this case, ADD:
* adds the value of a low register to a high register (H1=1, H2=0), or
* adds the value of a high register to a low register (H1=0, H2=1), or
* adds the value of a high register to another high register (H1=1, H2=1)
The condition code flags are not affected.


15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation Rd = Rd + Rm 


Exceptions None 
Qualifiers None 


Notes Operand restriction: If a low register is specified for Rd and Rm (H1=0 and H2=0), 
the result is UNPREDICTABLE. 


Equivalent ARM syntax and encoding 
ADD Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 16 15 14 12 11 10 9
ADD (5)

Add immediate
to program counter

Architecture v4T only

ADD Rd, PC, #<8_bit_immediate>


Description This form of the ADD instruction is used to address a PC-relative (word-sized) 
variable. 


In this case, ADD clears the bottom two bits of the value of the PC and adds 
the result to the value of the 8-bit immediate (values 0 to 255), and stores the result 
in register Rd. 


15 14 13 12 11 10 8 7 0 
Operation Rd = (PC AND 0xfffffffc) + <8_bit_immed>


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ADD Rd, PC, #<8_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 12 11 10 9
ADD (6)

Add immediate
to stack pointer

Architecture v4T only

ADD Rd, SP, #<8_bit_immediate>

Description This form of the ADD instruction is used to address an SP-relative (word-sized)
variable.

In this case, ADD adds the value of the SP and the value of the 8-bit immediate
(values 0 to 255), and stores the result in register Rd.
15 14 13 12 11 10 8 7 0
Operation Rd = SP + <8_bit_immed> 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ADD Rd, SP, #<8_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 12 11 10 9
ADD (7)


Increment 
stack pointer 


Architecture v4T only 


ADD SP, SP, #<7_bit_immediate> 


Description This form of the ADD instruction is used to decrease the size of the stack. 


In this case, ADD adds the value of the SP and the value of the 7-bit immediate 
(values 0 to 127), and stores the result back in the SP. 





1 0 1 1 0 0 0 0 0 7_bit_immediate 


Operation SP = SP + <7_bit_immed> 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 


ADD SP, SP, #<7_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 
AND

Logical AND

Architecture v4T only

AND Rd, Rm

Description The AND (Logical AND) instruction is most useful for extracting a field from
a register, by ANDing the register with a mask value that has 1’s in the field to be
extracted, and 0’s elsewhere.

AND performs a bitwise AND of the value of register Rm with the value of register
Rd, and stores the result back in register Rd. The condition code flags are updated
(based on the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation Rd = Rd AND Rm 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 
C Flag = <shifter_carry_out> 
V Flag = unaffected 
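The field-extraction use of AND mentioned in the description can be illustrated as follows. This is a small Python sketch under the assumption that registers are modelled as unsigned 32-bit integers; `and_flags` is a hypothetical helper and models only the result and the N and Z flags.

```python
def and_flags(rd, rm):
    """Model AND Rd, Rm: return (result, N, Z)."""
    result = rd & rm & 0xFFFFFFFF
    n = (result >> 31) & 1
    z = 1 if result == 0 else 0
    return result, n, z

value = 0x12345678
mask = 0x0000FF00              # 1s over bits 15..8, 0s elsewhere
field, n, z = and_flags(value, mask)

# A mask that shares no set bits with the value gives a zero result and Z = 1
empty = and_flags(0xF0, 0x0F)
```

The extracted field stays in its original bit position; a following shift would be needed to move it down to bit 0.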


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ANDS Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
ASR (1)

Arithmetic shift right
(immediate)

Architecture v4T only

ASR Rd, Rm, #<shift_imm>

Description This form of the ASR (Arithmetic Shift Right) instruction is used to provide the signed value of
a register divided by a constant power of 2.

In this case, ASR performs an arithmetic shift right of the value of register Rm by
an immediate value in the range 1 to 32, and stores the result into the destination
register Rd. The sign bit of Rm (Rm[31]) is inserted into the vacated bit positions.
A shift by 32 is encoded by:

<shift_imm> = 0

The condition code flags are updated (based on the result).

15 14 13 12 11 10 6 5 3 2 0
Operation if <shift_imm> == 0
    if Rm[31] == 0 then
        C Flag = Rm[31]
        Rd = 0
    else /* Rm[31] == 1 */
        C Flag = Rm[31]
        Rd = 0xffffffff
else /* <shift_imm> > 0 */
    C Flag = Rm[<shift_imm> - 1]
    Rd = Rm Arithmetic_Shift_Right <shift_imm>
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 
V Flag = unaffected 
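The Operation above, including the special encoding of a shift by 32, can be modelled as follows. This is a minimal Python sketch, assuming the source register is Rm as stated in the description; the function name `asr_imm` and the tuple it returns are choices made for this example only.

```python
def asr_imm(rm, shift_imm):
    """Model ASR Rd, Rm, #shift_imm; returns (Rd, C, N, Z).

    shift_imm is the 5-bit encoded field, so 0 stands for a shift by 32.
    """
    rm &= 0xFFFFFFFF
    sign = (rm >> 31) & 1
    if shift_imm == 0:                        # shift by 32
        c = sign
        rd = 0xFFFFFFFF if sign else 0
    else:
        c = (rm >> (shift_imm - 1)) & 1       # last bit shifted out
        signed = rm - (1 << 32) if sign else rm
        rd = (signed >> shift_imm) & 0xFFFFFFFF
    n = (rd >> 31) & 1
    z = 1 if rd == 0 else 0
    return rd, c, n, z

minus_two = asr_imm(0xFFFFFFF8, 2)    # -8 shifted right by 2
all_ones = asr_imm(0x80000000, 0)     # negative value shifted by 32
```

Note that an arithmetic shift rounds towards minus infinity, so for negative odd values the result differs from a division that truncates towards zero.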





Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rm, ASR #<shift_imm> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 
ASR (2)

Arithmetic shift right
(register)

Architecture v4T only

ASR Rd, Rs

Description This form of the ASR (Arithmetic Shift Right) instruction is used to provide
the signed value of a register divided by a variable power of 2.

In this case, ASR performs an arithmetic shift right of the value of register Rd by
the value in the least-significant byte of register Rs, and stores the result back into
the register Rd. The sign bit of the original Rd (Rd[31]) is inserted into the vacated
bit positions. The condition code flags are updated (based on the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation if Rs[7:0] == 0 then
    C Flag = unaffected
    Rd = unaffected
else if Rs[7:0] < 32 then
    C Flag = Rd[Rs[7:0] - 1]
    Rd = Rd Arithmetic_Shift_Right Rs[7:0]
else /* Rs[7:0] >= 32 */
    if Rd[31] == 0 then
        C Flag = Rd[31]
        Rd = 0
    else /* Rd[31] == 1 */
        C Flag = Rd[31]
        Rd = 0xffffffff
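The three cases of the register-specified shift (amount zero, amount below 32, amount of 32 or more) can be sketched as below. This Python model is an assumption-laden illustration: `asr_reg` is a hypothetical name, the current C flag is passed in explicitly, and only Rd and C are returned.

```python
def asr_reg(rd, rs, c_flag):
    """Model ASR Rd, Rs; returns (Rd, C). Only Rs[7:0] is used as the amount."""
    rd &= 0xFFFFFFFF
    amount = rs & 0xFF
    sign = (rd >> 31) & 1
    if amount == 0:
        return rd, c_flag                         # Rd and C unaffected
    if amount < 32:
        c = (rd >> (amount - 1)) & 1              # last bit shifted out
        signed = rd - (1 << 32) if sign else rd
        return (signed >> amount) & 0xFFFFFFFF, c
    return (0xFFFFFFFF if sign else 0), sign      # amount >= 32

unchanged = asr_reg(0x80000000, 0x100, 1)   # Rs[7:0] is 0 although Rs is not
saturated = asr_reg(0x80000000, 32, 0)
```

The first call shows why only the bottom byte of Rs matters: a register value of 0x100 behaves like a shift amount of zero.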


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rd, ASR Rs 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
B (1)

Conditional branch

Architecture v4T only

B<cond> <target_address>

Description This form of the B (Branch) instruction provides conditional changes to program
flow.

In this case, B causes a conditional branch to a target address. The branch target
address is calculated by shifting the 8-bit signed offset left by one bit,
sign-extending the result to 32 bits, and adding this to the contents of the PC
(which contains the address of the branch instruction plus 4). The instruction can
therefore specify a branch of +/- 256 bytes.

The instruction is only executed if the condition specified in the instruction matches
the condition code status. The conditions are defined in 3.3 The Condition Field on
page 3-4.

15 14 13 12 11 8 7 0
1 1 0 1 cond 8_bit_signed_offset
Operation if ConditionPassed(<cond>) then 


PC = PC + (SignExtend(<8_bit_signed_offset>) << 1) 





Exceptions None 
Qualifiers Condition Code 


Notes Offset calculation: An assembler will calculate the branch offset address from 
the difference between the address of the current instruction and the address 
of the target (given as a program label) minus four (because the PC holds 
the address of the current instruction plus four). 


Memory bounds: Branching backwards past location zero and forwards over 
the end of the 32-bit address space is UNPREDICTABLE. 
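The offset calculation described in the Notes can be sketched as a pair of helpers, one playing the assembler and one playing the processor. This is an illustrative Python sketch; the function names and the use of `ValueError` for out-of-range targets are assumptions of the example, not behaviour defined by the manual.

```python
def encode_bcond_offset(branch_addr, target_addr):
    """Return the 8-bit offset field for a Thumb conditional branch."""
    diff = target_addr - (branch_addr + 4)    # PC reads as branch address + 4
    if diff % 2:
        raise ValueError("target must be halfword-aligned")
    offset = diff >> 1
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for an 8-bit signed offset")
    return offset & 0xFF

def branch_target(branch_addr, field):
    """Recompute the target: PC + (SignExtend(field) << 1)."""
    signed = field - 0x100 if field & 0x80 else field
    return (branch_addr + 4 + (signed << 1)) & 0xFFFFFFFF

forward = encode_bcond_offset(0x1000, 0x1010)
backward = encode_bcond_offset(0x1000, 0x0FF0)
```

Decoding each field with `branch_target` gives back the original target address, which is a convenient round-trip check for an assembler.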


Equivalent ARM syntax and encoding 


B<cond> <target_address> 


31 28 27 26 25 24 23 8 7 0
sign extension of 8_bit_signed_offset 8_bit_signed_offset
B (2)

Unconditional branch

Architecture v4T only

B <target_address>

Description This form of the B (Branch) instruction provides unconditional changes to program
flow.

In this case, B causes an unconditional branch to a target address. The branch
target address is calculated by shifting the 11-bit signed (two’s complement) offset
left one bit, sign-extending the result to 32 bits, and adding this to the contents of
the PC (which contains the address of the branch instruction plus 4).
The instruction can therefore specify a branch of +/- 2048 bytes.

15 14 13 12 11 10 0
Operation PC = PC + (SignExtend(<11_bit_signed_offset>) << 1)


Exceptions None 
Qualifiers None 


Notes Offset calculation: An assembler will calculate the branch offset address from 
the difference between the address of the current instruction and the address 
of the target (given as a program label) minus four (because the PC holds 
the address of the current instruction plus four). 


Memory bounds: Branching backwards past location zero and forwards over 
the end of the 32-bit address space is UNPREDICTABLE. 


Equivalent ARM syntax and encoding 


B <target_address> 


28 27 26 25 24 23 12 11 10
sign extension of 11_bit_signed_offset 11_bit_signed_offset
BIC

Bit clear

BIC Rd, Rm

Description The BIC (Bit Clear) instruction can be used to clear selected bits in a register.
For each bit, BIC with 1 clears the bit, and BIC with 0 leaves it unchanged. 


BIC performs a bitwise AND of the complement of the value of register Rm with the 
value of register Rd, and stores the result back in register Rd. The condition code 
flags are updated (based on the result). 


Architecture v4T only 


15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation Rd = Rd AND NOT Rm 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 





C Flag = <shifter_carry_out> 
V Flag = unaffected 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
BICS Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
BL

Branch with link

Architecture v4T only

BL <target_address>

Description The BL (Branch with Link) instruction provides an unconditional subroutine call;
the return from subroutine is achieved by copying the LR to the PC (see page 6-57).
BL causes an unconditional subroutine call to a target address, and stores
the return address into the LR (link register or R14). Thumb subroutine calls are
made using a two-instruction sequence:

1 The first instruction (H==0) sign-extends the value of
<11_bit_signed_offset>, shifts the result left by 12 bits, adds the value
of the PC (the address of the branch instruction plus 4), and stores the result
in LR.

2 The second instruction (H==1) shifts the value of <11_bit_signed_offset> left
by one bit, adds the value of LR (that was calculated by the first instruction),
stores the result in the PC, and places the address of the instruction after
the second BL in LR.


The instruction can therefore specify a branch of +/- 4 Mbytes. 








15 14 13 12 11 10 0 
Operation if H == 0
    LR = PC + (SignExtend(<11_bit_signed_offset>) << 12)
else /* H == 1 */
    <return_address> = PC + 2 | 1
    PC = LR + (<11_bit_signed_offset> << 1)
    LR = <return_address>
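The way the two halves of BL combine into one 22-bit offset can be traced with a short model. This Python sketch is an illustration under stated assumptions: the helper names are hypothetical, `bl_addr` is the address of the first (H==0) instruction, and the return address is taken, following the description, as the address of the instruction after the second BL with bit 0 set as in the pseudo-code.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Sign-extend a `bits`-wide field to a Python int."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

def split_bl_offset(bl_addr, target):
    """Split the distance to `target` into the two 11-bit offset fields."""
    diff = target - (bl_addr + 4)             # PC reads as first instruction + 4
    high = (diff >> 12) & 0x7FF               # field of the H == 0 instruction
    low = (diff >> 1) & 0x7FF                 # field of the H == 1 instruction
    return high, low

def execute_bl(bl_addr, high, low):
    """Run both halves; return (new PC, LR)."""
    lr = (bl_addr + 4 + (sign_extend(high, 11) << 12)) & MASK32   # H == 0
    pc = (lr + (low << 1)) & MASK32                               # H == 1
    return_address = (bl_addr + 4) | 1        # instruction after the second BL
    return pc, return_address

high, low = split_bl_offset(0x8000, 0x12344)
pc, lr = execute_bl(0x8000, high, low)
back_pc, _ = execute_bl(0x8000, *split_bl_offset(0x8000, 0x7FF0))
```

Only the high field is sign-extended; the low field is added as an unsigned quantity, which is why a backward branch has a high field of all ones.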





Exceptions None 
Qualifiers None 


Notes Memory bounds: Branching backwards past location zero and forwards over 
the end of the 32-bit address space is UNPREDICTABLE. 


Offset calculation: An assembler will calculate the branch offset address from 
the difference between the address of the current instruction and the address 
of the target (given as a program label) minus four (because the PC holds 
the address of the current instruction plus four). 


Equivalent ARM syntax and encoding 
BL <target_address> 


28 27 26 25 24 23 22 21
offset
BX

Branch with exchange

BX Rm

Description The BX (Branch and Exchange) instruction is used to branch between ARM code
and Thumb code. 


BX branches and selects the instruction set decoder to use to decode 
the instructions at the branch destination. The branch target address is the value 
of register Rm. The T flag is updated with bit 0 of the value of register Rm. 


Architecture v4T only 


15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation T Flag = Rm[0] 


PC = Rm[31:1] << 1 


Exceptions None 


Qualifiers Condition Code 

Notes The H2 bit: This bit is the high register specifier:
H2 == 0 indicates that Rm specifies a low register
H2 == 1 indicates that Rm specifies a high register

Transferring to Thumb: When transferring to the Thumb instruction set, bit[0] of Rm
will be set to zero when transferred to the PC.


Transferring to ARM: When transferring to the ARM instruction set, bit[1:0] of Rm 
will be set to zero when transferred to the PC. 
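The Operation and the two transfer notes above can be combined into one small model. This is a hedged Python sketch: `bx` is a hypothetical helper, and it treats T == 1 as Thumb state and T == 0 as ARM state.

```python
def bx(rm):
    """Model BX Rm; returns (PC, T flag)."""
    t_flag = rm & 1                      # T Flag = Rm[0]
    pc = rm & 0xFFFFFFFE                 # PC = Rm[31:1] << 1
    if t_flag == 0:
        pc &= 0xFFFFFFFC                 # ARM state: bits[1:0] are set to zero
    return pc, t_flag

to_thumb = bx(0x8001)    # bit 0 set: continue in Thumb state at 0x8000
to_arm = bx(0x8002)      # bit 0 clear: continue in ARM state, word-aligned
```

This shows why a Thumb entry point is passed around with bit 0 set: the bit selects the instruction set and is never part of the fetch address.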


Equivalent ARM syntax and encoding 
BX Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 8 7 6 5 4 3 0 
CMN

Compare negative
(register)

Architecture v4T only

CMN Rn, Rm

Description The CMN (Compare Negative) instruction compares an arithmetic value and
the negative of an arithmetic value and sets the condition code flags so that
subsequent instructions can be conditionally executed (using a conditional branch
instruction).
CMN performs a comparison by adding (or subtracting the negative of) the value
of register Rm to (from) the value of register Rn. The condition code flags are
updated (based on the result).





15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation <alu_out> = Rn + Rm 
N Flag = <alu_out>[31] 
Z Flag = if <alu_out> == 0 then 1 else 0 
C Flag = CarryFrom(Rn + Rm)
V Flag = OverflowFrom(Rn + Rm)
Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
CMN Rn, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
CMP (1)

Compare immediate

Architecture v4T only

CMP Rn, #<8_bit_immediate>

Description This form of the CMP (Compare) instruction compares two arithmetic values and
sets the condition code flags so that subsequent instructions can be conditionally
executed (using a conditional branch instruction).

In this case, CMP performs a comparison by subtracting the value of the 8-bit
immediate (values 0 to 255) from the value of register Rn. The condition code flags
are updated (based on the result).

15 14 13 12 11 10 8 7 0
0 0 1 0 1 Rn 8_bit_immediate
Operation <alu_out> = Rn - <8_bit_immed>
N Flag = <alu_out>[31]
Z Flag = if <alu_out> == 0 then 1 else 0
C Flag = NOT BorrowFrom(Rn - <8_bit_immed>)
V Flag = OverflowFrom(Rn - <8_bit_immed>)
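The flag settings of a compare can be worked through with a small model. The following Python sketch is illustrative: `cmp_flags` is a hypothetical helper, registers are assumed to be unsigned 32-bit integers, and the overflow test uses the usual sign-bit rule for subtraction.

```python
def cmp_flags(rn, operand):
    """Model CMP: compute Rn - operand and return the (N, Z, C, V) flags."""
    rn &= 0xFFFFFFFF
    operand &= 0xFFFFFFFF
    alu_out = (rn - operand) & 0xFFFFFFFF
    n = (alu_out >> 31) & 1
    z = 1 if alu_out == 0 else 0
    c = 1 if rn >= operand else 0        # NOT BorrowFrom(Rn - operand)
    # Signed overflow: the operands differ in sign and the result's sign
    # differs from the sign of Rn.
    v = (((rn ^ operand) & (rn ^ alu_out)) >> 31) & 1
    return n, z, c, v

equal = cmp_flags(5, 5)
lower = cmp_flags(3, 5)
overflow = cmp_flags(0x80000000, 1)
```

Note that C is set when no borrow occurs, so an unsigned "higher or same" test passes when C == 1.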
Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
CMP Rn, #<8_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
CMP (2)

Compare register

Architecture v4T only

CMP Rn, Rm

Description This form of the CMP (Compare) instruction compares two arithmetic values and
sets the condition code flags so that subsequent instructions can be conditionally
executed (using a conditional branch instruction).

In this case, CMP performs a comparison by subtracting the value of register Rm
from the value of register Rn. The condition code flags are updated (based on
the result).
15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation <alu_out> = Rn - Rm 
N Flag = <alu_out>[31] 
Z Flag = if <alu_out> == 0 then 1 else 0 





C Flag = NOT BorrowFrom(Rn - Rm)
V Flag = OverflowFrom(Rn - Rm)


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
CMP Rn, Rm

31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
CMP (3)

Compare
high registers

Architecture v4T only

CMP Rn, Rm

Description This form of the CMP (Compare) instruction compares two arithmetic values in
the high registers.

In this case, CMP:
* compares the value of a low register and a high register (H1=1, H2=0), or
* compares the value of a high register and a low register (H1=0, H2=1), or
* compares the value of a high register and another high register (H1=1, H2=1)
and sets the condition code flags so that subsequent instructions can be
conditionally executed (using a conditional branch instruction).
15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation <alu_out> = Rn - Rm 
N Flag = <alu_out>[31] 
Z Flag = if <alu_out> == 0 then 1 else 0 





C Flag = NOT BorrowFrom(Rn - Rm)
V Flag = OverflowFrom(Rn - Rm)


Exceptions None 
Qualifiers None 


Notes Operand restriction: If a low register is specified for Rn and Rm (H1=0 and H2=0),
the result is UNPREDICTABLE. 


Equivalent ARM syntax and encoding 
CMP Rn, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 16 15 14 12 11 10 9 
EOR

Exclusive OR

Architecture v4T only

EOR Rd, Rm

Description The EOR (Exclusive OR) instruction can be used to invert selected bits in
a register. For each bit, EOR with 1 will invert that bit, and EOR with 0 will leave it
unchanged.

EOR performs a bitwise Exclusive OR of the value of register Rm with the value of
register Rd, and stores the result back in register Rd. The condition code flags are
updated (based on the result).





15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation Rd = Rd EOR Rm 
N Flag = Rd[31] 
Z Flag = if Rd == 0 then 1 else 0 





C Flag = <shifter_carry_out> 
V Flag = unaffected 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
EORS Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9


ARM Architecture Reference Manual 6-39 
ARM DUI 0100B 


LDMIA

Load multiple
increment after

Architecture v4T only

LDMIA Rn!, <register_list>

Description The LDM instruction is useful as a block load instruction. Combined with STM
(store multiple), it allows efficient block copy.

The LDMIA (Load Multiple Increment After) instruction loads a subset (or possibly
all) of the general-purpose registers from sequential memory locations.
The registers are loaded in sequence:
* the lowest-numbered register first, from the lowest memory address
(<start_address>)
* the highest-numbered register last, from the highest memory address
(<end_address>)


The <start_address> is the value of the base register Rn. 


Subsequent addresses are formed by incrementing the previous address by four. 
One address is produced for each register that is specified in <register_list>. 


The <end_address> value is four less than the sum of the value of the base 
register and four times the number of registers specified in <register_list>. 


Finally, the base register Rn is incremented by four times the numbers of registers 
in <register_list>. 


15 14 13 12 11 10 8 7 0
1 1 0 0 1 Rn register_list

Operation
<start_address> = Rn
<end_address> = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4) - 4
<address> = <start_address>
for i = 0 to 7
    if <register_list>[i] == 1
        Ri = Memory[<address>, 4]
        <address> = <address> + 4
assert <end_address> == <address> - 4
Rn = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4)








Equivalent ARM syntax and encoding 
LDMIA Rn!, <register_list> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 14 13 12 11 10 9

Exceptions Data Abort

Qualifiers None

Notes Register Rn: Specifies the base register used by <addressing_mode>.


Operand restrictions: If the base register Rn is specified in <register_list>, 
the final value of Rn is the loaded value (not the written-back value). 


Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION 
DEFINED, but is either the original base register value or the updated base 
register value (even if Rn is specified in <register_list>). 

Non-word-aligned addresses: Load multiple instructions ignore the least-significant 
two bits of <address> (the words are not rotated as for load word). 


Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 
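The LDMIA Operation and Notes can be modelled as below. This Python sketch is an illustration only: the `ldmia` function, the list of eight low registers, and the dictionary used as memory are assumptions of the example. It follows the operand restriction note by leaving a loaded base register untouched, and the non-word-aligned note by ignoring the two least-significant address bits.

```python
def ldmia(regs, memory, rn, register_list):
    """Model LDMIA Rn!, <register_list> on a list of 8 low registers.

    `memory` maps word addresses to 32-bit values; `register_list` is the
    8-bit mask from the encoding (bit i set means Ri is loaded).
    """
    count = bin(register_list & 0xFF).count("1")
    address = regs[rn] & 0xFFFFFFFC           # low two address bits are ignored
    written_back = (regs[rn] + 4 * count) & 0xFFFFFFFF
    for i in range(8):                        # lowest register, lowest address
        if register_list & (1 << i):
            regs[i] = memory[address]
            address += 4
    if not register_list & (1 << rn):         # a loaded Rn keeps the loaded value
        regs[rn] = written_back
    return regs

# LDMIA R7!, {R0 - R3, R5}
regs = [0] * 8
regs[7] = 0x2000
memory = {0x2000 + 4 * i: 100 + i for i in range(5)}
ldmia(regs, memory, 7, 0b00101111)
```

Five registers are transferred, so the base register advances by 20, matching the example in 6.7.2.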




LDR (1)

Load word
immediate offset

Architecture v4T only

LDR Rd, [Rn, #5_bit_offset]

Description This form of the LDR (Load Register) instruction allows 32-bit memory data to be
loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for accessing structure (record) fields.
With an offset of zero, the address produced is the unaltered value of the base
register Rn.


In this case, LDR loads a word from memory and writes it to register Rd. 

The memory address is calculated by adding 4 times the value of 
<5_bit_offset> to the value of register Rn. If the address is not word-aligned, 
the result is UNPREDICTABLE. 


15 14 13 12 11 10 6 5 3 2 0 
Operation <address> = Rn + (5_bit_offset * 4) 
if <address>[1:0] == 0b00 


<data> = Memory[<address>, 4] 
else 

<data> = UNPREDICTABLE 
Rd = <data> 
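The address calculation and alignment check in the Operation above can be sketched as follows. This is a minimal Python illustration; `ldr_imm`, the register list, and the dictionary memory are hypothetical, and the sketch raises `ValueError` for the case the manual leaves UNPREDICTABLE.

```python
def ldr_imm(regs, memory, rd, rn, offset5):
    """Model LDR Rd, [Rn, #offset]: the 5-bit field is scaled by 4."""
    if not 0 <= offset5 <= 31:
        raise ValueError("offset field must fit in 5 bits")
    address = (regs[rn] + offset5 * 4) & 0xFFFFFFFF
    if address & 0b11:
        # UNPREDICTABLE in the manual; this sketch simply refuses it.
        raise ValueError("unaligned word access")
    regs[rd] = memory[address]
    return address

regs = [0] * 8
regs[1] = 0x3000                      # base of a record
memory = {0x3008: 0xDEADBEEF}         # third word-sized field
address = ldr_imm(regs, memory, 0, 1, 2)
```

Because the field is scaled by 4, the 5-bit offset reaches word-sized fields up to 124 bytes above the base register.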


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
LDR Rd, [Rn, #5_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
LDR (2)

Load word
register offset

Architecture v4T only

LDR Rd, [Rn, Rm]

Description This form of the LDR (Load Register) instruction allows 32-bit memory data to be
loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for pointer + large offset arithmetic (use MOV
immediate to set the offset), and accessing a single element of an array.
In this case, LDR loads a word from memory and writes it to register Rd.
The memory address is calculated by adding the value of register Rm to the value
of register Rn. If the address is not word-aligned, the result is UNPREDICTABLE.
15 14 13 12 11 10 9 8 6 5 3 2 0
Operation <address> = Rn + Rm 
if <address>[1:0] == 0b00 
<data> = Memory[<address>, 4] 
else 
<data> = UNPREDICTABLE 
Rd = <data> 
Exceptions Data Abort 
Qualifiers None 
Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 
Equivalent ARM syntax and encoding 
LDR Rd, [Rn, Rm] 
31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
LDR (3)

Load word
PC-relative

Architecture v4T only

LDR Rd, [PC, #8_bit_offset]

Description This form of the LDR (Load Register) instruction allows 32-bit memory data to be
loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for accessing PC-relative data.

In this case, LDR loads a word from memory and writes it to register Rd.
The memory address is calculated by adding 4 times the value of
<8_bit_offset> to the value of the PC. If the address is not word-aligned,
the result is UNPREDICTABLE.


15 14 13 12 11 10 8 7 0 


Operation <address> = (PC[31:1] << 1) + (8_bit_offset * 4) 
if <address>[1:0] == 0b00 


<data> = Memory[<address>, 4] 
else 

<data> = UNPREDICTABLE 
Rd = <data> 


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
LDR Rd, [PC, #8_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 12 11 10 9 
LDR (4)

Load word
SP-relative

Architecture v4T only

LDR Rd, [SP, #8_bit_offset]

Description This form of the LDR (Load Register) instruction allows 32-bit memory data to be
loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for accessing stack data.

In this case, LDR loads a word from memory and writes it to register Rd.
The memory address is calculated by adding 4 times the value of
<8_bit_offset> to the value of the SP. If the address is not word-aligned,
the result is UNPREDICTABLE.

15 14 13 12 11 10 8 7 0
Operation <address> = SP + (8_bit_offset * 4) 
if <address>[1:0] == 0b00 


<data> = Memory[<address>, 4] 
else 

<data> = UNPREDICTABLE 
Rd = <data> 


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
LDR Rd, [SP, #8_bit_offset]

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 12 11 10 9
LDRB (1)

Load unsigned byte
immediate offset

Architecture v4T only

LDRB Rd, [Rn, #5_bit_offset]

Description This form of the LDRB (Load Register Byte) instruction allows 8-bit memory data
to be loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for accessing structure (record) fields.
With an offset of zero, the address produced is the unaltered value of the base
register Rn.

In this case, LDRB:
1 loads a byte from memory
2 zero-extends the byte to a 32-bit word
3 writes the word to register Rd

The memory address is calculated by adding the value of <5_bit_offset>
to the value of register Rn.

15 14 13 12 11 10 6 5 3 2 0
Operation <address> = Rn + 5_bit_offset


Rd = Memory [<address>,1] 
Exceptions Data Abort 


Qualifiers None 


Equivalent ARM syntax and encoding 
LDRB Rd, [Rn, #5_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
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ee LDRB (2) 


Load unsigned byte
register offset

Architecture v4T only

Description This form of the LDRB (Load Register Byte) instruction allows 8-bit memory data
to be loaded into a general-purpose register where its value can be manipulated.
The addressing mode is useful for pointer + large offset arithmetic (use the MOV
immediate to set the offset), and accessing a single element of an array.


In this case, LDRB: 

1 loads a byte from memory
2 zero-extends the byte to a 32-bit word
3 writes the word to register Rd

The memory address is calculated by adding the value of register Rm to the value of
register Rn.

15 14 13 12 11 10 9 8 6 5 3 2 0
Operation <address> = Rn + Rm
          Rd = Memory[<address>, 1]
Exceptions Data Abort 


Qualifiers None 


Equivalent ARM syntax and encoding 
LDRB Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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= LDRH(1)_ LDRH Rd, [Rn, #5_bit offset] 


Load unsigned halfword
immediate offset

Architecture v4T only

Description This form of the LDRH (Load Register Halfword) instruction allows 16-bit memory
data to be loaded into a general-purpose register where its value can be
manipulated. The addressing mode is useful for accessing structure (record)
fields. With an offset of zero, the address produced is the unaltered value of
the base register Rn.

In this case, LDRH: 

1 loads a halfword from memory
2 zero-extends the halfword to a 32-bit word
3 writes the word to register Rd

The memory address is calculated by adding 2 times the value of
<5_bit_offset> to the value of register Rn. If the address is not
halfword-aligned, the result is UNPREDICTABLE.


15 14 13 12 11 10 6 5 3 2 0
1 0 0 0 1 5_bit_offset Rn Rd
Operation <address> = Rn + (5_bit_offset * 2)
          if <address>[0] == 0
              <data> = Memory[<address>, 2]
          else
              <data> = UNPREDICTABLE
          Rd = <data>
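The following Python sketch models the halfword load under stated assumptions: memory is a bytes-like object indexed by address, the memory system is little-endian, and None stands in for an UNPREDICTABLE result. The function name ldrh_immediate is hypothetical.

```python
def ldrh_immediate(memory, rn, offset5):
    """Model LDRH Rd, [Rn, #5_bit_offset]; returns the new Rd value."""
    if not 0 <= offset5 <= 31:
        raise ValueError("5_bit_offset must be in the range 0 to 31")
    address = rn + offset5 * 2          # the offset is scaled by 2
    if address & 1:                     # <address>[0] != 0
        return None                     # UNPREDICTABLE in the architecture
    # Little-endian memory is an assumption; the result is zero-extended.
    return int.from_bytes(memory[address:address + 2], "little")
```

Because the halfword is zero-extended, a stored value of 0xFFFF is read back as 65535, not as -1.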


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bit[0] != 0 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
LDRH Rd, [Rn, #5_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
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LDRH Rd, [Rn, Rm] | LDRH (2) | 


Description This form of the LDRH (Load Register Halfword) instruction allows 16-bit memory 
data to be loaded into a general-purpose register where its value can be 
manipulated. The addressing mode is useful for pointer + large offset arithmetic 
(use MOV immediate to set the offset), and accessing a single element of an array. 


In this case, LDRH: 

1 loads a halfword from memory 

2 zero-extends the halfword to a 32-bit word 
3 writes the word to register Rd 


The memory address is calculated by adding the value of register Rm to the value 
of register Rn. If the address is not halfword-aligned, the result is UNPREDICTABLE. 


15 14 13 12 11 10 9 8 6 5 3 2 0
Operation <address> = Rn + Rm
          if <address>[0] == 0
              <data> = Memory[<address>, 2]
          else
              <data> = UNPREDICTABLE
          Rd = <data>


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bit[0] != 0 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
LDRH Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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Load unsigned halfword 
register offset 


Architecture v4T only 




LDRSB    LDRSB Rd, [Rn, Rm]

Load signed byte
register offset

Architecture v4T only

Description The LDRSB (Load Register Signed Byte) instruction allows 8-bit signed memory
data to be loaded into a general-purpose register where its value can be
manipulated. The addressing mode is useful for pointer + large offset arithmetic
(use the MOV immediate to set the offset), and accessing a single element of
an array.


LDRSB: 

1 loads a byte from memory
2 sign-extends the byte to a 32-bit word
3 writes the word to register Rd

The memory address is calculated by adding the value of register Rm to the value of
register Rn.

15 14 13 12 11 10 9 8 6 5 3 2 0
Operation <address> = Rn + Rm
          Rd = SignExtend(Memory[<address>, 1])
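The difference between the zero-extension performed by LDRB and the sign-extension performed by LDRSB can be illustrated with a short Python sketch. The helper names are hypothetical, and register values are modelled as unsigned 32-bit integers.

```python
def zero_extend_byte(byte):
    """LDRB behaviour: bits[31:8] of the result are zero."""
    return byte & 0xFF


def sign_extend_byte(byte):
    """LDRSB behaviour: bit 7 of the byte is copied into bits[31:8]."""
    byte &= 0xFF
    if byte & 0x80:
        return byte | 0xFFFFFF00
    return byte
```

A memory byte of 0x80 is therefore loaded as 0x00000080 by LDRB and as 0xFFFFFF80 (that is, -128) by LDRSB.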





Exceptions Data Abort 


Qualifiers None 


Equivalent ARM syntax and encoding 
LDRSB Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 
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LDRSH Rd, [Rn, Rm] | LDRSH | 








Load signed halfword
register offset

Architecture v4T only

Description The LDRSH (Load Register Signed Halfword) instruction allows 16-bit signed
memory data to be loaded into a general-purpose register where its value can be
manipulated. The addressing mode is useful for pointer + large offset arithmetic
(use the MOV immediate to set the offset), and accessing a single element of
an array.

LDRSH: 

1 loads a halfword from memory
2 sign-extends the halfword to a 32-bit word
3 writes the word to register Rd

The memory address is calculated by adding the value of register Rm to the value of
register Rn. If the address is not halfword-aligned, the result is UNPREDICTABLE.

15 14 13 12 11 10 9 8 6 5 3 2 0
Operation <address> = Rn + Rm
          if <address>[0] == 0
              <data> = Memory[<address>, 2]
          else
              <data> = UNPREDICTABLE
          Rd = SignExtend(<data>)

Exceptions Data Abort 

Qualifiers None 

Notes Alignment: If an implementation includes a System Control Coprocessor 

(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bit[0] != 0 will cause 
an alignment exception. 

Equivalent ARM syntax and encoding 
LDRSH Rd, [Rn, Rm]

31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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= LSL(1) _ LSL Rd, Rm, #<shift_imm> 


Logical shift left
(immediate)

Architecture v4T only

Description This form of the LSL (Logical Shift Left) instruction is used to provide either
the value of a register directly (LSL #0), or the value of a register multiplied by
a constant power of two.

In this case, LSL performs a logical shift left of the value of register Rm by
an immediate value in the range 0 to 31 and stores the result into the destination
register Rd. Zeros are inserted into the vacated bit positions. The condition code
flags are updated (based on the result).


15 14 13 12 11 10 6 5 3 2 0
Operation if <shift_imm> == 0
              C Flag = unaffected
              Rd = Rm
          else /* <shift_imm> > 0 */
              C Flag = Rm[32 - <shift_imm>]
              Rd = Rm Logical_Shift_Left <shift_imm>
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          V Flag = unaffected
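As a minimal sketch of this behaviour, the Python function below follows the description above: the source operand is Rm, and a shift of zero copies Rm to Rd and leaves the C flag alone. The function name lsl_immediate and the tuple it returns are assumptions made for illustration.

```python
def lsl_immediate(rm, shift_imm, c_flag):
    """Model LSL Rd, Rm, #<shift_imm>; returns (rd, n, z, c)."""
    if not 0 <= shift_imm <= 31:
        raise ValueError("shift_imm must be in the range 0 to 31")
    rm &= 0xFFFFFFFF
    if shift_imm == 0:
        rd, c = rm, c_flag                  # plain move, C flag unaffected
    else:
        c = (rm >> (32 - shift_imm)) & 1    # last bit shifted out
        rd = (rm << shift_imm) & 0xFFFFFFFF
    return rd, rd >> 31, int(rd == 0), c
```

For example, shifting 0x80000001 left by one position gives a result of 2 with the C flag set, because bit 31 is the bit shifted out.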





Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rm, LSL #<shift_imm> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 
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LSL Rd, Rs | LSL(2) § 


Logical shift left
(register)

Architecture v4T only

Description This form of the LSL (Logical Shift Left) instruction is used to provide the unsigned
value of a register multiplied by a variable (in a register) power of two.

In this case, LSL performs a logical shift left of the value of register Rd
by the value in the least-significant byte of register Rs and stores the result back
into the register Rd. Zeros are inserted into the vacated bit positions. The condition
code flags are updated (based on the result).


15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation if Rs[7:0] == 0 then
              C Flag = unaffected
              Rd = unaffected
          else if Rs[7:0] < 32 then
              C Flag = Rd[32 - Rs[7:0]]
              Rd = Rd Logical_Shift_Left Rs[7:0]
          else if Rs[7:0] == 32 then
              C Flag = Rd[0]
              Rd = 0
          else /* Rs[7:0] > 32 */
              C Flag = 0
              Rd = 0
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          V Flag = unaffected
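The four cases of the register-specified shift can be sketched in Python as shown below. This is an illustration, not a reference model; the function name lsl_register and the returned tuple are hypothetical.

```python
def lsl_register(rd, rs, c_flag):
    """Model LSL Rd, Rs; returns (rd, n, z, c)."""
    rd &= 0xFFFFFFFF
    shift = rs & 0xFF                   # only Rs[7:0] is used
    if shift == 0:
        c = c_flag                      # Rd and C flag unaffected
    elif shift < 32:
        c = (rd >> (32 - shift)) & 1    # last bit shifted out
        rd = (rd << shift) & 0xFFFFFFFF
    elif shift == 32:
        c = rd & 1                      # bit 0 is the last bit shifted out
        rd = 0
    else:
        c = 0
        rd = 0
    return rd, rd >> 31, int(rd == 0), c
```

For example, lsl_register(1, 32, 0) returns (0, 0, 1, 1): the result is zero and the C flag receives the old bit 0. A value of 0x100 in Rs counts as a shift of zero, because only the least-significant byte is examined.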
Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rd, LSL Rs 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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= LSR(1)_ LSR Rd, Rm, #<shift_imm> 


Logical shift right
(immediate)

Architecture v4T only

Description This form of the LSR (Logical Shift Right) instruction is used to provide the value
of a register, divided by a constant power of two.

In this case, LSR performs a logical shift right of the value of register Rm by
an immediate value in the range 1 to 32, and stores the result into the destination
register Rd. Zeros are inserted into the vacated bit positions. The condition code
flags are updated (based on the result).

A shift by 32 is encoded by:

<shift_imm> = 0


15 14 13 12 11 10 6 5 3 2 0
Operation if <shift_imm> == 0
              C Flag = Rm[31]
              Rd = 0
          else /* <shift_imm> > 0 */
              C Flag = Rm[<shift_imm> - 1]
              Rd = Rm Logical_Shift_Right <shift_imm>
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          V Flag = unaffected
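The special encoding of a shift by 32 can be made explicit in a short Python sketch. The argument shift_imm is the raw 5-bit field from the instruction; the function name lsr_immediate and the returned tuple are assumptions made for illustration.

```python
def lsr_immediate(rm, shift_imm):
    """Model LSR Rd, Rm, #<shift_imm>; returns (rd, n, z, c)."""
    if not 0 <= shift_imm <= 31:
        raise ValueError("the shift_imm field must be in the range 0 to 31")
    rm &= 0xFFFFFFFF
    # A field value of 0 encodes a shift by 32.
    shift = 32 if shift_imm == 0 else shift_imm
    c = (rm >> (shift - 1)) & 1         # last bit shifted out
    rd = rm >> shift                    # zeros enter from the left
    return rd, rd >> 31, int(rd == 0), c
```

Treating the field value 0 as 32 lets both branches of the operation share one formula: the C flag receives Rm[31] and the result is zero.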








Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rm, LSR #<shift_imm> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 





6-54 ARM Architecture Reference Manual 
ARM DUI 0100B 







LSR (2)    LSR Rd, Rs


Description This form of the LSR (Logical Shift Right) instruction is used to provide 
the unsigned value of a register divided by a variable (in a register) power of two. 


In this case, LSR performs a logical shift right of the value of register Rd
by the value in the least-significant byte of register Rs, and stores the result back
into the register Rd. Zeros are inserted into the vacated bit positions. The condition
code flags are updated (based on the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation if Rs[7:0] == 0 then
              C Flag = unaffected
              Rd = unaffected
          else if Rs[7:0] < 32 then
              C Flag = Rd[Rs[7:0] - 1]
              Rd = Rd Logical_Shift_Right Rs[7:0]
          else if Rs[7:0] == 32 then
              C Flag = Rd[31]
              Rd = 0
          else /* Rs[7:0] > 32 */
              C Flag = 0
              Rd = 0
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          V Flag = unaffected
Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rd, LSR Rs 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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Z 


Logical shift right
(register)

Architecture v4T only


MOV (1)    MOV Rd, #<8_bit_immediate>

Move immediate

Architecture v4T only

Description This form of the MOV (Move) instruction moves a large constant value to a
register.

In this case, MOV writes the value of the 8-bit immediate (values 0 to 255)
to the destination register Rd. The condition code flags are updated (based on
the result).

15 14 13 12 11 10 8 7 0
Operation Rd = <8_bit_immediate>
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = unaffected
          V Flag = unaffected


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, #<8_bit_immediate>


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
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MOV Rd, Rm ie Y OV (2) 5 


Move high registers

Architecture v4T only

Description This form of the MOV (Move) instruction is used to move a value to, from, or
between high registers.

In this case, MOV:
* moves the value of a low register to a high register (H1=1, H2=0), or
* moves the value of a high register to a low register (H1=0, H2=1), or
* moves the value of a high register to another high register
  (H1=1, H2=1).

The subroutine return instruction is:
MOV PC, LR
(after executing a BL sequence; see page 6-33).


15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation Rd = Rm 


Exceptions None 
Qualifiers None 


Notes Operand restriction: If a low register is specified for both Rd and Rm (H1=0 and H2=0),
the result is UNPREDICTABLE.


Equivalent ARM syntax and encoding 
MOV Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 16 15 14 12 11 10 9
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= MUL MUL Rd, Rm 


Multiply

Architecture v4T only

Description The MUL (Multiply) instruction multiplies signed or unsigned variables to produce
a 32-bit result.

MUL multiplies the value of register Rm with the value of register Rd, and stores
the result back in the register Rd. The condition code flags are updated (based on
the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation Rd = (Rm * Rd)[31:0]
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = UNPREDICTABLE
          V Flag = unaffected


Exceptions None 
Qualifiers None 
Notes Operand restriction: Specifying the same register for Rd and Rm has
UNPREDICTABLE results.


Early termination: If the multiplier implementation supports early termination, 
it must be implemented on the value of the Rd operand. The type of early 
termination used (signed or unsigned) is IMPLEMENTATION DEFINED. 


Signed and unsigned: Because the MUL instruction produces only the lower 
32 bits of the 64-bit product, MUL gives the same answer for multiplication of 
both signed and unsigned numbers. 
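The signed and unsigned equivalence can be checked with a small Python sketch. The helper names mul32 and to_unsigned are hypothetical; register values are modelled as unsigned 32-bit integers, and negative numbers are given their two's complement encoding.

```python
def mul32(rd, rm):
    """Return the low 32 bits of rd * rm, as MUL writes to Rd."""
    return (rd * rm) & 0xFFFFFFFF


def to_unsigned(value):
    """Two's complement encoding of a signed integer in 32 bits."""
    return value & 0xFFFFFFFF
```

Multiplying the encodings of -3 and 5 gives the encoding of -15, and multiplying the encodings of -3 and -5 gives 15, even though the same register contents read as unsigned numbers are very large. Only the upper 32 bits of the full 64-bit product, which MUL discards, differ between the two interpretations.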


Equivalent ARM syntax and encoding 
MULS Rd, Rm, Rd 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 
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MVN Rd, Rm | MVN 


Move NOT (register)

Architecture v4T only

Description The MVN (Move NOT) instruction is used to complement a register value, often
to form a bit mask.

MVN writes the logical one's complement of the value of register Rm to the destination
register Rd. The condition code flags are updated (based on the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation Rd = NOT Rm
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = unaffected
          V Flag = unaffected


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MVNS Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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ARM 


NEG    NEG Rd, Rm

Negate register

Architecture v4T only

Description The NEG (Negate) instruction negates the value of one register and stores
the result in a second register.

NEG subtracts the value of register Rm from zero, and stores the result in
the destination register Rd. The condition code flags are updated (based on
the result).

15 14 13 12 11 10 9 6 5 3 2 0
Operation Rd = 0 - Rm
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = NOT BorrowFrom(0 - Rm)
          V Flag = OverflowFrom(0 - Rm)
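The flag results of a negation are easy to misread, so the following Python sketch spells them out. The function name neg and the returned tuple are assumptions made for illustration; the operand is modelled as an unsigned 32-bit integer.

```python
def neg(rm):
    """Model NEG Rd, Rm; returns (rd, n, z, c, v)."""
    rm &= 0xFFFFFFFF
    rd = (0 - rm) & 0xFFFFFFFF
    c = int(rm == 0)            # NOT BorrowFrom: no borrow only when Rm is 0
    v = int(rm == 0x80000000)   # overflow only for the most negative value
    return rd, rd >> 31, int(rd == 0), c, v
```

Negating 0x80000000 (-2147483648) yields the same bit pattern again and sets the V flag, because +2147483648 cannot be represented in 32 bits.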


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
RSBS Rd, Rm, #0


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
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ORR Rd, Rm @) aia 


Logical OR

Architecture v4T only

Description The ORR (Logical OR) instruction can be used to set selected bits in a register;
for each bit, ORR with 1 will set the bit, and ORR with 0 will leave it unchanged.

ORR performs a bitwise (inclusive) OR of the value of register Rm with the value
of register Rd, and stores the result back in register Rd. The condition code flags
are updated (based on the result).

15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation Rd = Rd OR Rm
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = <shifter_carry_out>
          V Flag = unaffected


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
ORRS Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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a4 @) a POP {<register_list>, <PC>} 


Thumb 


Pop multiple registers

Architecture v4T only

Description The POP (Pop Multiple Registers) instruction is useful for stack operations,
including procedure exit, to restore saved registers, load the PC with the return
address, and update the stack pointer.

POP loads a subset (or possibly all) of the general-purpose registers and
optionally the PC from sequential memory locations. Registers are loaded in
sequence:
* the lowest-numbered register first, from the lowest memory address
  (<start_address>)
* the highest-numbered register last, from the highest memory address
  (<end_address>)

The <start_address> is the value of the SP.

Subsequent addresses are formed by incrementing the previous address by four.
One address is produced for each register that is specified in <register_list>.

The <end_address> value is four less than the sum of the value of the SP and
four times the number of registers specified in <register_list> (including
the R bit).

Finally, the SP is incremented by four times the number of registers
in <register_list> (plus the R bit).

15 14 13 12 11 10 9 8 7 0
Operation
<start_address> = SP
<end_address> = SP + (Number_Of_Set_Bits_In(<register_list>) + R) * 4 - 4
<address> = <start_address>
for i = 0 to 7
    if <register_list>[i] == 1
        Ri = Memory[<address>, 4]
        <address> = <address> + 4
if R == 1
    PC = Memory[<address>, 4]
    <address> = <address> + 4
assert <end_address> == <address> - 4
SP = SP + (Number_Of_Set_Bits_In(<register_list>) + R) * 4
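The load sequence can be modelled with a short Python sketch. It is illustrative only: memory is assumed to be a dictionary that maps word-aligned addresses to 32-bit words, and the function name pop and its return value are hypothetical.

```python
def pop(memory, sp, register_list, r_bit):
    """Model POP {<register_list>, <PC>}.

    Returns (loaded, new_sp), where loaded maps register names to the
    values read from the stack.
    """
    loaded = {}
    address = sp                             # <start_address>
    for i in range(8):                       # R0 to R7, lowest first
        if (register_list >> i) & 1:
            loaded["R%d" % i] = memory[address]
            address += 4
    if r_bit:                                # the PC is loaded last
        loaded["PC"] = memory[address]
        address += 4
    return loaded, address                   # the final address is the new SP
```

The final address equals the original SP plus four times the number of words transferred, which is the value written back to the stack pointer.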


Equivalent ARM syntax and encoding 
LDMIA SP!, {<register_list>, <PC>}


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
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Exceptions 


Data Abort

Qualifiers None

Notes The R bit: If R == 1, the PC is also loaded from the stack; if R == 0, the PC is not
loaded.

Data Abort: If a data abort is signalled, the value left in SP is IMPLEMENTATION
DEFINED, but is either the original SP value or the updated SP value.

Non-word aligned addresses: Pop multiple instructions ignore the least-significant
two bits of <address> (the words are not rotated as for load word).

Alignment: If an implementation includes a System Control Coprocessor
(see Chapter 7, System Architecture and System Control Coprocessor), and
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause
an alignment exception.


PUSH    PUSH {<register_list>, <LR>}

Push multiple registers

Architecture v4T only

Description The PUSH (Push Multiple Registers) instruction is useful for stack operations,
including procedure entry, to save registers (optionally including the return
address), and to update the stack pointer.

PUSH stores a subset (or possibly all) of the general-purpose registers and
optionally the LR to sequential memory locations. The registers are stored in
sequence:
* the lowest-numbered register first, to the lowest memory address
  (<start_address>)
* the highest-numbered register last, to the highest memory address
  (<end_address>)

The <start_address> is the value of the SP minus four times the number of
registers specified in <register_list> (including the R bit).

Subsequent addresses are formed by incrementing the previous address by four.
One address is produced for each register that is specified in <register_list>.

The <end_address> value is four less than the original value of the SP.

Finally, the SP is decremented by four times the number of registers
in <register_list> (plus the R bit).

15 14 13 12 11 10 9 8 7 0
1 0 1 1 0 1 0 R register_list
Operation
<start_address> = SP - (Number_Of_Set_Bits_In(<register_list>) + R) * 4
<end_address> = SP - 4
<address> = <start_address>
for i = 0 to 7
    if <register_list>[i] == 1
        Memory[<address>, 4] = Ri
        <address> = <address> + 4
if R == 1
    Memory[<address>, 4] = LR
    <address> = <address> + 4
assert <end_address> == <address> - 4
SP = SP - (Number_Of_Set_Bits_In(<register_list>) + R) * 4














Equivalent ARM syntax and encoding 
STMDB SP!, {<register_list>, <LR>}


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9
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CO 


Exceptions Data Abort
Qualifiers None
Notes The R bit: If R == 1, the LR is also stored to the stack; if R == 0, the LR is not stored.

Data Abort: If a data abort is signalled, the value left in SP is IMPLEMENTATION
DEFINED, but is either the original SP value or the updated SP value.

Non-word aligned addresses: Push multiple instructions ignore the least-significant
two bits of <address> (the words are not rotated as for load word).

Alignment: If an implementation includes a System Control Coprocessor
(see Chapter 7, System Architecture and System Control Coprocessor), and
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause
an alignment exception.
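A PUSH can be modelled in Python as sketched below, assuming the full descending stack that the Thumb PUSH and POP pair operates on (PUSH corresponds to a store multiple, decrement before, with SP writeback). The dictionaries used for memory and registers and the function name push are assumptions of this sketch.

```python
def push(memory, sp, registers, register_list, r_bit):
    """Model PUSH {<register_list>, <LR>}; returns the new SP.

    registers maps names such as "R0" or "LR" to values, and memory maps
    word-aligned addresses to 32-bit words.
    """
    count = bin(register_list & 0xFF).count("1") + (1 if r_bit else 0)
    address = sp - 4 * count                 # lowest address written
    new_sp = address
    for i in range(8):                       # lowest register, lowest address
        if (register_list >> i) & 1:
            memory[address] = registers["R%d" % i]
            address += 4
    if r_bit:                                # the LR goes to the highest address
        memory[address] = registers["LR"]
        address += 4
    assert address == sp                     # last word written at SP - 4
    return new_sp
```

Pushing R0, R4 and LR with SP at 0x10C writes three words at 0x100, 0x104 and 0x108 and leaves SP at 0x100, so a later POP of the same list restores the registers in the same order.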
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ROR ROR Rd, Rs 


Rotate right
(register)

Architecture v4T only

Description The ROR (Rotate Right Register) instruction is used to provide the value of
a register rotated by a variable value (in a register).

ROR performs a rotate right of the value of register Rd by the value in
the least-significant byte of register Rs, and stores the result back into register Rd.
The bits that are rotated off the right end are inserted into the vacated bit positions
on the left. The condition code flags are updated (based on the result).


15 14 13 12 11 10 9 8 7 6 5 3 2 0
Operation if Rs[7:0] == 0 then
              C Flag = unaffected
              Rd = unaffected
          else if Rs[4:0] == 0 then
              C Flag = Rd[31]
              Rd = unaffected
          else /* Rs[4:0] > 0 */
              C Flag = Rd[Rs[4:0] - 1]
              Rd = Rd Rotate_Right Rs[4:0]
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          V Flag = unaffected
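The three cases of the rotate can be sketched in Python as follows. The function name ror_register and the returned tuple are hypothetical; the N and Z results are derived from the final value of Rd, in line with the statement that the condition code flags are updated based on the result.

```python
def ror_register(rd, rs, c_flag):
    """Model ROR Rd, Rs; returns (rd, n, z, c)."""
    rd &= 0xFFFFFFFF
    if (rs & 0xFF) == 0:
        c = c_flag                           # Rd and C flag unaffected
    elif (rs & 0x1F) == 0:
        c = rd >> 31                         # multiple of 32: Rd unchanged
    else:
        amount = rs & 0x1F
        c = (rd >> (amount - 1)) & 1         # last bit rotated off the right
        rd = ((rd >> amount) | (rd << (32 - amount))) & 0xFFFFFFFF
    return rd, rd >> 31, int(rd == 0), c
```

For example, rotating 0x12345678 right by 8 gives 0x78123456. A rotate by 32 leaves the value unchanged but still copies bit 31 into the C flag.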


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
MOVS Rd, Rd, ROR Rs 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 8 7 6 5 4 3 0
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SBC Rd, Rm } Ss BCE 


Subtract with carry
(register)

Architecture v4T only

Description The SBC (Subtract with Carry) instruction is used to synthesize 64-bit subtraction.
If register pairs R0,R1 and R2,R3 hold 64-bit values (R0 and R2 hold the
least-significant word), the following instructions leave the 64-bit difference in R0,R1.

SUB R0, R2
SBC R1, R3

SBC subtracts the value of register Rm and the value of NOT(Carry Flag) from
the value of register Rd, and stores the result in register Rd. The condition code
flags are updated (based on the result).


15 14 13 12 11 10 9 6 5 3 2 0
Operation Rd = Rd - Rm - NOT(C Flag)
          N Flag = Rd[31]
          Z Flag = if Rd == 0 then 1 else 0
          C Flag = NOT BorrowFrom(Rd - Rm - NOT(C Flag))
          V Flag = OverflowFrom(Rd - Rm - NOT(C Flag))
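The 64-bit subtraction sequence can be traced with the following Python sketch. It assumes that the C flag after a subtraction holds NOT borrow (1 when no borrow occurred); the helper names sub, sbc and sub64 are hypothetical, and all register values are unsigned 32-bit integers.

```python
MASK = 0xFFFFFFFF


def sub(rd, rm):
    """SUB Rd, Rm: returns (result, c), where c is NOT borrow."""
    return (rd - rm) & MASK, int(rd >= rm)


def sbc(rd, rm, c_flag):
    """SBC Rd, Rm: returns (result, c), where c is NOT borrow."""
    borrow_in = 1 - c_flag                   # NOT(C Flag)
    return (rd - rm - borrow_in) & MASK, int(rd >= rm + borrow_in)


def sub64(r0, r1, r2, r3):
    """Subtract R3:R2 from R1:R0 with the SUB/SBC pair; returns (r0, r1)."""
    r0, c = sub(r0, r2)                      # low words, sets the C flag
    r1, _ = sbc(r1, r3, c)                   # high words, consumes the borrow
    return r0, r1
```

Subtracting 1 from 0x0000000100000000 gives a low word of 0xFFFFFFFF and a high word of 0: the SUB of the low words borrows, clears the C flag, and the SBC then subtracts the extra one from the high word.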


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
SBCS Rd, Rd, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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2 S™ STMIA Rn!, <register_list> 


Store multiple
increment after

Architecture v4T only

Description The STM (Store Multiple) instruction is useful as a block store instruction.
Combined with LDM (load multiple), it allows efficient block copy.

The STMIA (Store Multiple Increment After) instruction stores a subset (or possibly
all) of the general-purpose registers to sequential memory locations. The registers
are stored in sequence:
* the lowest-numbered register first, to the lowest memory address
  (<start_address>)
* the highest-numbered register last, to the highest memory address
  (<end_address>)

The <start_address> is the value of the base register Rn.

Subsequent addresses are formed by incrementing the previous address by four.
One address is produced for each register that is specified in <register_list>.

The <end_address> value is four less than the sum of the value of the base
register and four times the number of registers specified in <register_list>.

Finally, the base register Rn is incremented by 4 times the number of registers in
<register_list>.





15 14 13 12 11 10 8 7 0
1 1 0 0 0 Rn register_list
Operation
<start_address> = Rn
<end_address> = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4) - 4
<address> = <start_address>
for i = 0 to 7
    if <register_list>[i] == 1
        Memory[<address>, 4] = Ri
        <address> = <address> + 4
assert <end_address> == <address> - 4
Rn = Rn + (Number_Of_Set_Bits_In(<register_list>) * 4)
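The store sequence and the base register writeback can be modelled with a short Python sketch. It is an illustration under stated assumptions: registers is a list holding R0 to R7, memory is a dictionary that maps word-aligned addresses to 32-bit words, and the function name stmia is hypothetical.

```python
def stmia(memory, registers, rn, register_list):
    """Model STMIA Rn!, <register_list>; returns the written-back Rn value."""
    address = registers[rn]                  # <start_address>
    for i in range(8):                       # lowest register, lowest address
        if (register_list >> i) & 1:
            memory[address] = registers[i]
            address += 4
    registers[rn] = address                  # writeback: Rn + 4 * count
    return address
```

Storing R0 and R2 through a base register that holds 0x200 writes the words at 0x200 and 0x204 and leaves the base register at 0x208, ready for the next block.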








Equivalent ARM syntax and encoding 
STMIA Rn!, <register_list> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 14 13 12 11 10 9
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OE 


Exceptions Data Abort

Qualifiers None

Notes Register Rn: Specifies the base register used by <addressing_mode>.

Operand restrictions: If the base register Rn is specified in <register_list>,
and writeback is specified, the value stored for Rn is UNPREDICTABLE.

Data Abort: If a data abort is signalled, the value left in Rn is IMPLEMENTATION
DEFINED, but is either the original base register value or the updated base
register value.

Non-word-aligned addresses: Store multiple instructions ignore the
least-significant two bits of <address> (the words are not rotated as for load
word).

Alignment: If an implementation includes a System Control Coprocessor
(see Chapter 7, System Architecture and System Control Coprocessor), and
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause
an alignment exception.


STR (1)    STR Rd, [Rn, #5_bit_offset]

Store word
immediate offset

Architecture v4T only

Description The STR (Store Register) instruction allows 32-bit data from a general-purpose
register to be stored to memory. The addressing mode is useful for accessing
structure (record) fields. With an offset of zero, the address produced is
the unaltered value of the base register Rn.

STR stores a word from register Rd to memory. The memory address is calculated
by adding 4 times the value of <5_bit_offset> to the value of register Rn.
If the address is not word-aligned, the result is UNPREDICTABLE.


15 14 13 12 11 10 6 5 3 2 0 
Operation <address> = Rn + (5_bit_offset * 4) 
<data> = Rd 
if <address>[1:0] == 0b00 
Memory [<address>,4] = <data> 
else 
Memory [<address>,4] = UNPREDICTABLE 


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
STR Rd, [Rn, #5_bit_offset]


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
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STR Rd, [Rn, Rm] | STR) | 


Store word
register offset

Architecture v4T only

Description This form of the STR (Store Register) instruction allows 32-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for pointer + large offset arithmetic (use the MOV immediate to set the offset), and
for accessing a single element of an array.


In this case, STR stores a word from register Rd to memory. The memory address 
is calculated by adding the value of register Rm to the value of register Rn. 
If the address is not word-aligned, the result is UNPREDICTABLE. 


15 14 13 12 11 10 9 8 6 5 3 2 0 
Operation <address> = Rn + Rm 
<data> = Rd 
if <address>[1:0] == 0b00 
Memory [<address>,4] = <data> 
else 
Memory [<address>,4] = UNPREDICTABLE 


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
STR Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11
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= STR() STR Rd, [SP, #8 bit offset] 


Store word
SP-relative

Architecture v4T only

Description This form of the STR (Store Register) instruction allows 32-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for accessing stack data.

In this case, STR stores a word from register Rd to memory. The memory address
is calculated by adding 4 times the value of <8_bit_offset> to the value of
the SP. If the address is not word-aligned, the result is UNPREDICTABLE.


15 14 13 12 11 10 8 7 0
Operation <address> = SP + (8_bit_offset * 4)
          <data> = Rd
          if <address>[1:0] == 0b00
              Memory[<address>, 4] = <data>
          else
              Memory[<address>, 4] = UNPREDICTABLE


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and
alignment checking is enabled, an address with bits[1:0] != 0b00 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
STR Rd, [SP, #8_bit_offset]


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 12 11 10 9 
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STRB Rd, [Rn, #5_bit_offset] STRB (1) 


Store byte
immediate offset

Architecture v4T only

Description This form of the STRB (Store Register Byte) instruction allows 8-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for accessing structure (record) fields. With an offset of zero, the address produced
is the unaltered value of the base register Rn.


In this case, STRB stores a byte from the least-significant byte of register Rd to 
memory. The memory address is calculated by adding the value of 
<5_bit_offset> to the value of register Rn. 





15 14 13 12 11 10 6 5 3 2 0
0 1 1 1 0 5_bit_offset Rn Rd
Operation <address> = Rn + 5_bit_offset
          Memory[<address>, 1] = Rd[7:0]
Exceptions Data Abort 


Qualifiers None 


Equivalent ARM syntax and encoding 
STRB Rd, [Rn, #5_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9
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Z 


STRB (2)    STRB Rd, [Rn, Rm]

Store byte
register offset

Architecture v4T only

Description This form of the STRB (Store Register Byte) instruction allows 8-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for pointer + large offset arithmetic (use the MOV immediate to set the offset), and
for accessing a single element of an array.

In this case, STRB stores a byte from the least-significant byte of register Rd
to memory. The memory address is calculated by adding the value of register Rm
to the value of register Rn.


15 14 13 12 11 10 9 8 6 5 3 2 0 
Operation <address> = Rn + Rm 
Memory [<address>,1] = Rd[7:0] 


Exceptions Data Abort 


Qualifiers None 


Equivalent ARM syntax and encoding 
STRB Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 


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STRH Rd, [Rn, #5_bit_offset] | STRH (1) | 


Store halfword
immediate offset

Architecture v4T only

Description This form of the STRH (Store Register Halfword) instruction allows 16-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for accessing structure (record) fields. With an offset of zero, the address produced
is the unaltered value of the base register Rn.


In this case, STRH stores a halfword from the least-significant halfword of register 
Rd to memory. The memory address is calculated by adding 2 times the value of 
<5_bit_offset> to the value of register Rn. If the address is not 
halfword-aligned, the result is UNPREDICTABLE. 


15 14 13 12 11 10 6 5 3 2 0 
Operation <address> = Rn + (5_bit_offset * 2) 
<data> = Rd 
if <address>[0] == 0
Memory [<address>,2] = <data>[15:0] 
else 
Memory [<address>,2] = UNPREDICTABLE 


Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor 
(see Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bit[0] != 0 will cause 
an alignment exception. 
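
The Operation and Notes above can be modelled in a few lines of C. This is a minimal sketch, not the architecture's definition: the function name, the status codes, the flat byte array standing in for memory and the little-endian byte order are all assumptions made for illustration. The UNPREDICTABLE case is modelled by refusing the store.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; they are not part of the manual. */
enum strh_status { STRH_OK = 0, STRH_UNALIGNED = 1, STRH_OUT_OF_RANGE = 2 };

/* Model of Thumb STRH (1): <address> = Rn + (5_bit_offset * 2). */
enum strh_status thumb_strh_imm(uint8_t *mem, size_t mem_size,
                                uint32_t rn, uint32_t offset5, uint32_t rd)
{
    uint32_t address = rn + ((offset5 & 0x1Fu) * 2u);

    if (address & 1u)                  /* not halfword-aligned */
        return STRH_UNALIGNED;         /* architecturally UNPREDICTABLE */
    if (mem_size < 2 || address > mem_size - 2)
        return STRH_OUT_OF_RANGE;      /* stands in for a Data Abort */

    /* Store Rd[15:0]; little-endian layout is an assumption of this model. */
    mem[address]     = (uint8_t)(rd & 0xFFu);
    mem[address + 1] = (uint8_t)((rd >> 8) & 0xFFu);
    return STRH_OK;
}
```

With Rn = 4 and an offset field of 3, the store lands at address 10 and writes only the low halfword of Rd.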


Equivalent ARM syntax and encoding 
STRH Rd, [Rn, #5_bit_offset] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9


STRH (2)


Architecture v4T only


STRH Rd, [Rn, Rm]


Description This form of the STRH (Store Register Halfword) instruction allows 16-bit data from
a general-purpose register to be stored to memory. The addressing mode is useful
for pointer + large offset arithmetic (use the MOV immediate to set the offset), and
for accessing a single element of an array.


In this case, STRH stores a halfword from the least-significant halfword of register
Rd to memory. The memory address is calculated by adding the value of register
Rm to the value of register Rn. If the address is not halfword-aligned, the result is
UNPREDICTABLE.


15 14 13 12 11 10 9 8 6 5 3 2 0


Operation <address> = Rn + Rm
<data> = Rd
if <address>[0] == 0
Memory [<address>,2] = <data>[15:0]
else
Memory [<address>,2] = UNPREDICTABLE
Exceptions Data Abort 
Qualifiers None 


Notes Alignment: If an implementation includes a System Control Coprocessor (see 
Chapter 7, System Architecture and System Control Coprocessor), and 
alignment checking is enabled, an address with bit[0] != 0 will cause 
an alignment exception. 


Equivalent ARM syntax and encoding 
STRH Rd, [Rn, Rm] 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 8 7 6 5 4 3 2 1 0










SUB (1)


Subtract immediate


Architecture v4T only


SUB Rd, Rn, #<3_bit_immediate>


Description This form of the SUB (Subtract) instruction subtracts a small constant value from
the value of a register and stores the result in a second register.


In this case, SUB subtracts the value of the 3-bit immediate (values 0 to 7) from
the value of register Rn, and stores the result in the destination register Rd.
The condition code flags are updated (based on the result).








Operation Rd = Rn - <3_bit_immed>
N Flag = Rd[31]
Z Flag = if Rd == 0 then 1 else 0
C Flag = NOT BorrowFrom(Rn - <3_bit_immed>)
V Flag = OverflowFrom(Rn - <3_bit_immed>)


Exceptions None 


Qualifiers None 
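
The flag rules in the Operation block can be checked with a small C model. This is an illustrative sketch only: the struct and function names are invented here, and BorrowFrom and OverflowFrom are expressed with ordinary unsigned arithmetic rather than taken from any ARM-supplied code. The same function covers the other flag-setting SUB forms, since only the source of the second operand differs.

```c
#include <stdint.h>

/* Hypothetical container for the result and the four condition flags. */
struct sub_result {
    uint32_t value;
    unsigned n, z, c, v;
};

/* Model of a flag-setting subtract: value = rn - operand. */
struct sub_result thumb_sub(uint32_t rn, uint32_t operand)
{
    struct sub_result r;

    r.value = rn - operand;              /* wraps modulo 2^32 */
    r.n = (r.value >> 31) & 1u;          /* N = result[31] */
    r.z = (r.value == 0);                /* Z = 1 if result is zero */
    r.c = (rn >= operand);               /* C = NOT BorrowFrom(rn - operand) */
    /* Signed overflow: operands differ in sign and the result's sign
       differs from rn's sign. */
    r.v = (((rn ^ operand) & (rn ^ r.value)) >> 31) & 1u;
    return r;
}
```

Note that C is set when no borrow occurs, so subtracting 7 from 5 clears C, while subtracting 7 from 7 sets both Z and C.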


Equivalent ARM syntax and encoding 
SUBS Rd, Rn, #<3_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9


SUB (2)


Subtract
large immediate


Architecture v4T only


SUB Rd, #<8_bit_immediate>


Description This form of the SUB (Subtract) instruction subtracts a large constant value from
the value of a register and stores the result back in the same register.


In this case, SUB subtracts the value of the 8-bit immediate (values 0 to 255) from
the value of register Rd, and stores the result back in the register Rd.
The condition code flags are updated (based on the result).


15 14 13 12 11 10 8 7 0 
Operation Rd = Rd - <8_bit_immed>
N Flag = Rd[31]
Z Flag = if Rd == 0 then 1 else 0
C Flag = NOT BorrowFrom(Rd - <8_bit_immed>)
V Flag = OverflowFrom(Rd - <8_bit_immed>)


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
SUBS Rd, Rd, #<8_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9


SUB (3)


Subtract register


Architecture v4T only


SUB Rd, Rn, Rm


Description This form of the SUB (Subtract) instruction subtracts the value of one register from
the value of a second register and stores the result in a third register.


In this case, SUB subtracts the value of register Rm from the value of register Rn,
and stores the result in the destination register Rd. The condition code flags are
updated (based on the result).


15 14 13 12 11 10 9 8 6 5 3 2 0 
Operation Rd = Rn - Rm
N Flag = Rd[31]
Z Flag = if Rd == 0 then 1 else 0
C Flag = NOT BorrowFrom(Rn - Rm)
V Flag = OverflowFrom(Rn - Rm)


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
SUBS Rd, Rn, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9


SUB (4)


Decrement 
stack pointer 


Thumb 


Architecture v4T only 


SUB SP, SP, #<7_bit_immediate>


Description This form of the SUB (Subtract) instruction is used to increase the size of the stack. 


In this case, SUB subtracts the value of the 7-bit immediate (values 0 to 127) from
the value of the SP, and stores the result back in the SP.





1 0 1 1 0 0 0 0 1 7_bit_immediate 


Operation SP = SP - <7_bit_immed> 


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
SUB SP, SP, #<7_bit_immediate> 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 
SWI


Software Interrupt


Architecture v4T only


SWI <8_bit_immediate>


Description The SWI instruction is used as an operating system service call. It can be used in
two ways:


1 Uses the 8-bit offset to indicate the OS service that is required.


2 Ignores the 8-bit field and indicates the service required with
a general-purpose register.


A SWI exception is generated, which is handled by an operating system to provide
the requested service; see 2.5 Exceptions on page 2-6.


1 1 0 1 1 1 1 1 8_bit_immediate





Operation R14_svc = PC 
SPSR_svc = CPSR 
CPSR[5] = 0 ; begin ARM execution
CPSR[4:0] = 0b10011 ; enter Supervisor mode 
CPSR[7] = 1 ; disable IRQ 
PC = 0x08 


Exceptions None 


Qualifiers None 
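
The exception entry sequence in the Operation block can be sketched as a C function acting on a small model of processor state. The struct, the field names and the function name are hypothetical. The sketch assumes the usual ARMv4T CPSR layout, in which the T (Thumb state) bit is bit 5, the I (IRQ disable) bit is bit 7 and the mode field is bits 4:0, and it assumes the caller has already placed the value to be saved in R14_svc into the `pc` field.

```c
#include <stdint.h>

/* Hypothetical model of the state touched by SWI exception entry. */
struct cpu_state {
    uint32_t pc;
    uint32_t cpsr;
    uint32_t r14_svc;
    uint32_t spsr_svc;
};

#define CPSR_MODE_MASK 0x1Fu        /* CPSR[4:0] */
#define CPSR_MODE_SVC  0x13u        /* 0b10011, Supervisor mode */
#define CPSR_T_BIT     (1u << 5)    /* Thumb state bit */
#define CPSR_I_BIT     (1u << 7)    /* IRQ disable bit */

/* Apply the SWI entry steps in the order given in the Operation block. */
void take_swi(struct cpu_state *cpu)
{
    cpu->r14_svc  = cpu->pc;                 /* R14_svc = PC */
    cpu->spsr_svc = cpu->cpsr;               /* SPSR_svc = CPSR */
    cpu->cpsr &= ~CPSR_T_BIT;                /* begin ARM execution */
    cpu->cpsr = (cpu->cpsr & ~CPSR_MODE_MASK) | CPSR_MODE_SVC;
    cpu->cpsr |= CPSR_I_BIT;                 /* disable IRQ */
    cpu->pc = 0x08u;                         /* SWI vector */
}
```

Because the old CPSR is copied to SPSR_svc before it is modified, the handler can recover whether the caller was executing in Thumb state.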


Equivalent ARM syntax and encoding 
SWI <8_bit_immediate>


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9


1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8_bit_immediate
TST


Test bits


Architecture v4T only


TST Rn, Rm


Description The TST (Test) instruction is used to determine if many bits of a register are all
clear, or if at least one bit of a register is set.


TST performs a comparison by logically ANDing the value of register Rm with
the value of register Rn. The condition code flags are updated (based on the
result).
15 14 13 12 11 10 9 8 7 6 5 3 2 0 
Operation <alu_out> = Rn AND Rm
N Flag = <alu_out>[31]
Z Flag = if <alu_out> == 0 then 1 else 0
C Flag = UNAFFECTED
V Flag = UNAFFECTED


Exceptions None 


Qualifiers None 


Equivalent ARM syntax and encoding 
TST Rn, Rm 


31 30 29 28 27 26 25 24 23 22 21 20 19 16 15 12 11 10 9 
This chapter describes ARM system architecture, and the System Control
Coprocessor.


7.1 Introduction
7.2 CP15 Access
7.1 Introduction
Implementations of the ARM architecture optionally incorporate:


* on-chip Memory Management Unit (MMU) (including Translation Lookaside
  Buffer(s) (TLB))
* Instruction and/or Data Cache (IDC)
* Write Buffer (WB)


If these functions are implemented, coprocessor 15 is used to control them. 
Coprocessor 15 is called the System Control Coprocessor or just CP15. 


The MMU incorporates a two-level page table for virtual to physical address translation, 
and access permission attributes for each virtual to physical translation. The MMU page 
tables also contain cache and write buffer enables; therefore, if a cache or a write buffer 
is implemented, the MMU must also be implemented.


7.2 CP15 Access 


CP15 defines 16 registers. CP15 registers can only be accessed with MRC and MCR
instructions (CDP, LDC and STC instructions to CP15 will cause an undefined
instruction trap). The CRn field of MRC and MCR instructions specifies the coprocessor
register to access, and the CRm field and opcode_2 field are used to specify a particular
action when addressing some registers.


31 28 27 26 25 24 23 21 20 19 16 15 12 11 8 7 5 4 3 0


cond | 1 1 1 0 | opcode_1 | L | CRn | Rd | 1 1 1 1 | opcode_2 | 1 | CRm





Opcode_1 should be zero (SBZ) for all CP15 instructions. 
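
As an illustration of how the CRn, CRm and opcode_2 fields select a CP15 register and action, the sketch below assembles the 32-bit ARM instruction word for an MCR or MRC to coprocessor 15. The function name is hypothetical. The bit positions used are those of the standard ARM coprocessor register transfer encoding, and the sketch assumes the always (AL) condition and opcode_1 = 0.

```c
#include <stdint.h>

/* Build the instruction word for "MCR/MRC p15, 0, Rd, CRn, CRm, opcode_2".
   is_read != 0 selects MRC (coprocessor to ARM register), 0 selects MCR. */
uint32_t cp15_encode(int is_read, unsigned rd, unsigned crn,
                     unsigned crm, unsigned opcode_2)
{
    return (0xEu << 28)                      /* cond = AL (assumed) */
         | (0xEu << 24)                      /* bits 27:24 = 1110 */
         | (0u << 21)                        /* opcode_1 = 0 (SBZ for CP15) */
         | ((is_read ? 1u : 0u) << 20)       /* L bit: 1 = MRC, 0 = MCR */
         | ((crn & 0xFu) << 16)              /* coprocessor register */
         | ((rd & 0xFu) << 12)               /* ARM register */
         | (15u << 8)                        /* coprocessor number 15 */
         | ((opcode_2 & 0x7u) << 5)
         | (1u << 4)                         /* register transfer marker */
         | (crm & 0xFu);
}
```

For example, `cp15_encode(0, 0, 7, 7, 0)` yields the word for `MCR p15, 0, R0, c7, c7, 0`, and `cp15_encode(1, 0, 0, 0, 0)` yields the word for `MRC p15, 0, R0, c0, c0, 0`.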


If a cache, MMU and Write Buffer are not implemented, CP15 will not be implemented,
and all accesses to CP15 will cause undefined exceptions. 


Any access to CP15 while the processor is in User mode will cause an undefined 
instruction exception. 


An MRC instruction from coprocessor 15 to register 15 is UNPREDICTABLE. 


7.3  CP15 Architectures 


If a cache, MMU and Write Buffer are implemented, CP15 register 0 contains 
an architecture field that specifies a particular layout and functionality for the remaining 
registers. 


Reading from CP15 register 0 returns an architecture and implementation-defined 
identification from the processor: 


31 24 23 16 15 4 3 0


Implementor | Architecture version | Part number | Revision


Bits[3:0] contain the revision number for the processor


Bits[15:4] contain a 3-digit part number in binary-coded decimal format


Bits[23:16] contain the architecture version:
0x00 Version 3 (see 7.5 ARMv3 System Control Coprocessor on page 7-10)
0x01 Version 4 (see 7.4 ARMv4 System Control Coprocessor below)


Bits[31:24] contain the ASCII code of an implementor's trademark:
0x41 = A ARM Ltd
0x44 = D Digital Equipment Corporation


7.4 ARMv4 System Control Coprocessor


ARM Architecture Version 4 System Control Coprocessor is designed to control a single
combined instruction and data cache, or separate instruction and data caches, a write
buffer, a prefetch buffer, and a virtual to physical address translator including combined
instruction and data TLB or separate instruction and data TLBs. CP15 also controls
various system configuration signals.


Register | Reads                  | Writes                 | Update Policy
0        | ID Register            | UNPREDICTABLE          | No update
1        | Control                | Control                | Read Modify Write
2        | Translation Table Base | Translation Table Base |
3        | Domain Access Control  | Domain Access Control  |
4        | UNPREDICTABLE          | UNPREDICTABLE          |
5        | Fault Status           | Fault Status           |
6        | Fault Address          | Fault Address          |
7        | Cache Operations       | Cache operations       | Write only
8        | TLB operations         | TLB operations         | Write only
9 to 15  | UNPREDICTABLE          | UNPREDICTABLE          |


Table 7-1: ARMv4 CP15 register summary


7.4.1 Register 0: ID register


31 24 23 16 15 4 3 0


Implementor | Architecture version | Part number | Revision


Reading from CP15 register 0 returns the implementation-defined identification for 
the processor. The CRm and opcode_2 fields are ignored when reading CP15 
register 0, and SHOULD BE ZERO. 


Bits[3:0] contain the revision number for the processor 


Bits[15:4] contain a 3-digit part number in binary-coded decimal format 
(for example, 0x810 for ARM810) 


Bits[23:16] contain the architecture version 
(for example, 0x01 = Version 4) 


Bits[31:24] contain the ASCII code of an implementor's trademark
(0x41 = A = ARM Ltd.)


Writing to CP15 register 0 is UNPREDICTABLE.
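
The field boundaries listed above translate directly into shifts and masks. The following C sketch splits an ID register value into its four fields and converts the binary-coded decimal part number to an ordinary integer. The type and function names are invented for this example, and the sample value used afterwards is made up to match the ARM810 example in the text rather than read from real hardware.

```c
#include <stdint.h>

/* Hypothetical holder for the decoded CP15 register 0 fields. */
struct cp15_id {
    char     implementor;    /* Bits[31:24], an ASCII code */
    unsigned architecture;   /* Bits[23:16] */
    unsigned part_number;    /* Bits[15:4], three BCD digits */
    unsigned revision;       /* Bits[3:0] */
};

struct cp15_id cp15_decode_id(uint32_t id)
{
    struct cp15_id out;

    out.implementor  = (char)((id >> 24) & 0xFFu);
    out.architecture = (id >> 16) & 0xFFu;
    out.part_number  = (id >> 4) & 0xFFFu;
    out.revision     = id & 0xFu;
    return out;
}

/* Convert the 3-digit BCD part number to decimal, e.g. 0x810 gives 810. */
unsigned cp15_part_decimal(unsigned bcd)
{
    return ((bcd >> 8) & 0xFu) * 100u
         + ((bcd >> 4) & 0xFu) * 10u
         + (bcd & 0xFu);
}
```

A value of 0x41018102 would decode as implementor 'A', architecture 0x01 (Version 4), part number 0x810 and revision 2.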


7.4.2 Register 1: Control register


31 13 12 11 10 9 8 7 6 5 4 3 2 1 0


UNP/SBZ | I | Z | F | R | S | B | L | D | P | W | C | A | M


Reading from CP15 register 1 reads the control bits. The CRm and opcode_2 fields are 
IGNORED when reading CP15 register 1, and should be zero. 


Writing to CP15 register 1 sets the control bits. The CRm and opcode_2 fields are not 
used when writing CP15 register 1, and should be zero. 


All control bits are set to zero on reset. The control bits have the following functions: 


M Bit 0 Memory Management Unit (MMU) Enable/Disable 
0 = MMU disabled 
1 = MMU enabled 


A Bit 1 Alignment Fault Enable/Disable 
0 = Address alignment fault checking disabled 
1 = Address alignment fault checking enabled 


C Bit 2 Instruction and data cache Enable/Disable 
If separate instruction and data caches are implemented, this bit 
controls only the data cache enable/disable, and the I bit (bit 12)
controls the instruction cache enable/disable. 
0 = Instruction and data cache (IDC) disabled 
1 = Instruction and data cache (IDC) enabled 


W Bit 3 Write buffer Enable/Disable 
0 = Write buffer disabled 
1 = Write buffer enabled 
P Bit 4 32-bit/26-bit Exception handlers
Implementations that support 26-bit configurations (see Chapter 5,
The 26-bit Architectures) use this bit to control the PROG32 signal.
0 = 26-bit exception handlers
1 = 32-bit exception handlers
This bit is UNPREDICTABLE on implementations that do not support
26-bit configurations, and should be 1.


D Bit 5 32-bit/26-bit data address range
Implementations that support 26-bit data spaces use this bit to control
the DATA32 signal (see Chapter 5, The 26-bit Architectures).
0 = 26-bit data address checking enabled
1 = 26-bit data address checking disabled (32-bit data addresses)
This bit is UNPREDICTABLE on implementations that do not support
26-bit data spaces, and should be 1.


L Bit 6 IMPLEMENTATION DEFINED


B Bit 7 Big endian/Little endian
0 = Little endian operation
1 = Big endian operation


S Bit 8 System protection
This bit modifies the MMU protection system.


R Bit 9 ROM protection
This bit modifies the MMU protection system.


F Bit 10 IMPLEMENTATION DEFINED


Z Bit 11 IMPLEMENTATION DEFINED


I Bit 12 Instruction cache enable/disable
0 = Instruction cache disabled
1 = Instruction cache enabled
If separate instruction and data caches are implemented, this bit
controls the instruction cache enable/disable, and the C bit (bit 2)
controls the data cache enable/disable. If a combined instruction and
data cache is implemented, any writes to this bit are IGNORED, and
reads return an UNPREDICTABLE value.


Bits 31:13 When read, these bits return an UNPREDICTABLE value; when written, they SHOULD
BE ZERO.
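
The bit assignments above lend themselves to a set of masks and a read-modify-write helper. The sketch below is illustrative only: the macro and function names are hypothetical, and the helper operates on plain integers instead of issuing MRC and MCR instructions. It clears bits 31:13 of the value to be written, since those bits read as UNPREDICTABLE and should be written as zero.

```c
#include <stdint.h>

/* Hypothetical names for the ARMv4 control register bits listed above. */
#define CTRL_M (1u << 0)    /* MMU enable */
#define CTRL_A (1u << 1)    /* alignment fault checking enable */
#define CTRL_C (1u << 2)    /* cache (or data cache) enable */
#define CTRL_W (1u << 3)    /* write buffer enable */
#define CTRL_P (1u << 4)    /* 32-bit exception handlers */
#define CTRL_D (1u << 5)    /* 32-bit data address range */
#define CTRL_B (1u << 7)    /* big endian operation */
#define CTRL_S (1u << 8)    /* system protection */
#define CTRL_R (1u << 9)    /* ROM protection */
#define CTRL_I (1u << 12)   /* instruction cache enable */

#define CTRL_DEFINED_MASK 0x1FFFu   /* bits 12:0 */

/* Compute the value to write back after reading `current`:
   clear the bits in `clear`, set the bits in `set`, zero bits 31:13. */
uint32_t cp15_control_update(uint32_t current, uint32_t set, uint32_t clear)
{
    uint32_t next = (current & ~clear) | set;

    return next & CTRL_DEFINED_MASK;
}
```

On a real system the `current` value would come from an MRC read of CP15 register 1 and the result would be written back with MCR. Whether the MMU, cache and write buffer may be enabled in a single write is governed by the implementation-defined enabling sequence.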


Enabling the MMU 


Care must be taken if the translated address differs from the untranslated address, 

as the instructions following the enabling of the MMU will have been fetched using no 
address translation and enabling the MMU may be considered as a branch with delayed 
execution. A similar situation occurs when the MMU is disabled. The correct code 
sequence for enabling and disabling the MMU is IMPLEMENTATION DEFINED. 


If the cache and/or write buffer are enabled when the MMU is not enabled, the results 
are UNPREDICTABLE. 
7.4.3 Register 2: Translation table base register


31 14 13 0 


Translation Table Base UNP/SBZP 


Reading from CP15 register 2 returns the pointer to the currently active first-level
translation table in bits[31:14] and an UNPREDICTABLE value in bits[13:0]. The CRm and
opcode_2 fields are IGNORED when reading CP15 register 2, and SHOULD BE ZERO.


Writing to CP15 register 2 updates the pointer to the currently active first-level
translation table from the value in bits[31:14] of the written value. Bits[13:0] must be
written as zero. The CRm and opcode_2 fields are IGNORED when writing CP15
register 2, and SHOULD BE ZERO.


7.4.4 Register 3: Domain access control register 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0


D15 | D14 | D13 | D12 | D11 | D10 | D9 | D8 | D7 | D6 | D5 | D4 | D3 | D2 | D1 | D0


Reading from CP15 register 3 returns the value of the Domain Access Control Register.
The CRm and opcode_2 fields are IGNORED when reading CP15 register 3, and SHOULD
BE ZERO.


Writing to CP15 register 3 writes the value of the Domain Access Control Register.
The CRm and opcode_2 fields are IGNORED when writing CP15 register 3, and SHOULD
BE ZERO.

The Domain Access Control Register consists of sixteen 2-bit fields, each defining 
the access permissions for one of the 16 Domains (D15-D0). For the meaning of each 
field, see 7.9 Domains on page 7-23. 
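
Because each domain occupies a 2-bit field, reading or changing one domain's setting is a shift-and-mask operation. The helpers below are a sketch with hypothetical names. They assume the layout in which domain D0 occupies bits 1:0 and domain D15 occupies bits 31:30, and they treat the 2-bit value as opaque, since its meaning is defined in 7.9 Domains.

```c
#include <stdint.h>

/* Return the 2-bit access field for the given domain (0 to 15). */
unsigned dacr_get(uint32_t dacr, unsigned domain)
{
    unsigned shift = (domain & 0xFu) * 2u;

    return (dacr >> shift) & 0x3u;
}

/* Return a copy of dacr with the given domain's field replaced by value. */
uint32_t dacr_set(uint32_t dacr, unsigned domain, unsigned value)
{
    unsigned shift = (domain & 0xFu) * 2u;

    return (dacr & ~(0x3u << shift)) | ((uint32_t)(value & 0x3u) << shift);
}
```

Only the selected field changes, so the other fifteen domains keep their existing access settings.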


7.4.5 Register 4: Reserved 
Reading and writing CP15 register 4 is UNPREDICTABLE.


7.4.6 Register 5: Fault status register


31 9 8 7 4 3 0


UNP/SBZ | 0 | Domain | Status


Reading CP15 register 5 returns the value of the Fault Status Register (FSR). The FSR
contains the source of the last data fault. Note that only the bottom 9 bits are returned.
The upper 23 bits are UNPREDICTABLE. The FSR indicates the domain and type of access
being attempted when an abort occurred.


Bit 8 returns zero 

Bits 7:4 specify which of the 16 domains (D15-D0) was being accessed when 
a fault occurred 

Bits 3:0 indicate the type of access being attempted. The encoding of these 
bits is shown in Table 7-9: Priority encoding of fault status on page 
7-25. 


The FSR is only updated for data faults, not for prefetch faults. The CRm and opcode_2 
fields are IGNORED when reading CP15 register 5, and SHOULD BE ZERO. 
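
A fault handler typically begins by splitting the FSR value into its domain and status fields. The sketch below shows that split with hypothetical type and function names. The upper bits are masked off because they are UNPREDICTABLE on a read, and the status value is left undecoded because its encoding is given in Table 7-9.

```c
#include <stdint.h>

/* Hypothetical holder for the two FSR fields described above. */
struct fsr_fields {
    unsigned domain;   /* Bits 7:4, the domain being accessed */
    unsigned status;   /* Bits 3:0, the type of access being attempted */
};

struct fsr_fields fsr_decode(uint32_t fsr)
{
    struct fsr_fields f;

    f.domain = (fsr >> 4) & 0xFu;
    f.status = fsr & 0xFu;
    return f;
}
```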


Writing CP15 register 5 sets the Fault Status Register to the value of the data written. 
This is useful for a debugger to restore the value of the FSR. The upper 24 bits written 
should be zero (SBZ). The CRm and opcode_2 fields are IGNORED when writing CP15 
register 5, and SHOULD BE ZERO. 


7.4.7 Register 6: Fault address register 


31 0 


Fault Address 


Reading CP15 register 6 returns the value of the Fault Address Register (FAR). 

The FAR holds the virtual address of the access which was attempted when a fault 
occurred. The FAR is only updated for data faults, not for prefetch faults. The CRm and 
opcode_2 fields are IGNORED when reading CP15 register 6, and SHOULD BE ZERO. 


Writing CP15 register 6 sets the Fault Address Register to the value of the data written. 
This is useful for a debugger to restore the value of the FAR. The CRm and opcode_2 
fields are IGNORED when writing CP15 register 6, and SHOULD BE ZERO. 
7.4.8 Register 7: Cache functions


Writing to CP15 register 7 is used to control caches and buffers.


An ARM implementation may include a combined instruction and data cache, or
separate instruction and data caches. A write buffer, prefetch buffer and branch target
cache may also be implemented and are controlled by this register. Several cache
functions are defined, and the function to be performed is selected by the opcode_2 and
CRm fields in the MCR instruction used to write CP15 register 7.


The Flush ID functions flush (invalidate) all cache data.


The Flush Entry functions may be implemented to flush more than a single entry, up to
the entire cache.


The Clean D cache functions write out dirty data held in a writeback cache. They do not
invalidate any cached data.


Any functions that are not relevant to a particular implementation are UNPREDICTABLE.
All unused values of opcode_2 and CRm are UNPREDICTABLE.


Not all functions are provided by all implementations.


Reading CP15 register 7 is UNPREDICTABLE.


Function                  | opcode_2 value | CRm value | Data | Instruction
Flush ID cache(s)         | 0b000          | 0b0111    | SBZ  | MCR p15, 0, Rd, c7, c7, 0
Flush ID single entry     | 0b001          | 0b0111    | IMP  | MCR p15, 0, Rd, c7, c7, 1
Flush I cache             | 0b000          | 0b0101    | SBZ  | MCR p15, 0, Rd, c7, c5, 0
Flush I single entry      | 0b001          | 0b0101    | IMP  | MCR p15, 0, Rd, c7, c5, 1
Flush D cache             | 0b000          | 0b0110    | SBZ  | MCR p15, 0, Rd, c7, c6, 0
Flush D single entry      | 0b001          | 0b0110    | IMP  | MCR p15, 0, Rd, c7, c6, 1
Clean ID cache            | 0b000          | 0b1011    | SBZ  | MCR p15, 0, Rd, c7, c11, 0
Clean ID cache entry      | 0b001          | 0b1011    | IMP  | MCR p15, 0, Rd, c7, c11, 1
Clean D cache             | 0b000          | 0b1010    | SBZ  | MCR p15, 0, Rd, c7, c10, 0
Clean D cache entry       | 0b001          | 0b1010    | IMP  | MCR p15, 0, Rd, c7, c10, 1
Clean and Flush ID cache  | 0b000          | 0b1111    | SBZ  | MCR p15, 0, Rd, c7, c15, 0
Clean and Flush ID entry  | 0b001          | 0b1111    | IMP  | MCR p15, 0, Rd, c7, c15, 1
Clean and Flush D cache   | 0b000          | 0b1110    | SBZ  | MCR p15, 0, Rd, c7, c14, 0
Clean and Flush D entry   | 0b001          | 0b1110    | IMP  | MCR p15, 0, Rd, c7, c14, 1
Flush Prefetch Buffer     | 0b100          | 0b0101    | SBZ  | MCR p15, 0, Rd, c7, c5, 4
Drain Write Buffer        | 0b100          | 0b1010    | SBZ  | MCR p15, 0, Rd, c7, c10, 4
Flush Branch Target Cache | 0b110          | 0b0101    | SBZ  | MCR p15, 0, Rd, c7, c5, 6
Flush Branch Target Entry | 0b111          | 0b0101    | IMP  | MCR p15, 0, Rd, c7, c5, 7


Table 7-2: Cache functions
7.4.9 Register 8: TLB functions


Writing to CP15 register 8 is used to control Translation Lookaside Buffers (TLBs).

An ARM implementation may include a combined instruction and data TLB, or separate
instruction and data TLBs. Several TLB functions are defined, and the function to be
performed is selected by the opcode_2 and CRm fields in the MCR instruction used to
write CP15 register 8.


Not all functions are provided by all implementations. 
The Flush ID functions flush (invalidate) all TLB data. 


The Flush I and Flush D functions are intended for use on implementations with split
instruction and data TLBs; if used on an implementation with a combined TLB, 
the behaviour is as if a Flush ID function was used. 


The Flush Entry functions may be implemented to flush more than a single entry, up to
the entire TLB.


Any functions that are not relevant to a particular implementation are UNPREDICTABLE. 
All unused values of opcode_2 and CRm are UNPREDICTABLE. 


Reading CP15 register 8 is UNPREDICTABLE. 

Function              | opcode_2 | CRm    | Data            | Instruction
Flush ID TLB(s)       | 0b000    | 0b0111 | SBZ             | MCR p15, 0, Rd, c8, c7, 0
Flush ID single entry | 0b001    | 0b0111 | Virtual Address | MCR p15, 0, Rd, c8, c7, 1
Flush I TLB           | 0b000    | 0b0101 | SBZ             | MCR p15, 0, Rd, c8, c5, 0
Flush I single entry  | 0b001    | 0b0101 | Virtual Address | MCR p15, 0, Rd, c8, c5, 1
Flush D TLB           | 0b000    | 0b0110 | SBZ             | MCR p15, 0, Rd, c8, c6, 0
Flush D single entry  | 0b001    | 0b0110 | Virtual Address | MCR p15, 0, Rd, c8, c6, 1


Table 7-3: TLB functions


7.4.10 Registers 9-15: Reserved


Accessing (reading or writing) any of these registers is UNPREDICTABLE.
7.5 ARMv3 System Control Coprocessor


The ARM Architecture Version 3 System Control Coprocessor is designed to control 
a single combined instruction and data cache, a write buffer, and a virtual to physical 
address translator including combined instruction and data TLB. 


CP15 also controls various system configuration signals: 


Register | Reads         | Writes
0        | ID Register   | UNPREDICTABLE
1        | UNPREDICTABLE | Control
2        | UNPREDICTABLE | Translation Table Base
3        | UNPREDICTABLE | Domain Access Control
4        | UNPREDICTABLE | UNPREDICTABLE
5        | Fault Status  | Flush TLB
6        | Fault Address | Flush TLB Entry
7        | UNPREDICTABLE | Flush Cache
8 to 15  | UNDEFINED     | UNDEFINED


Table 7-4: ARMv3 CP15 register summary


7.5.1 Register 0: ID register 


31 24 23 16 15 4 3 0 


Implementor | Architecture version | Part number | Revision


Reading from CP15 register 0 returns an architecture and IMPLEMENTATION DEFINED 
identification for the processor. The CRm and opcode_2 fields are IGNORED when 
reading CP15 register 0, and SHOULD BE ZERO. 


Bits[3:0] contain the revision number for the processor 


Bits[15:4] contain a 3 digit part number in binary coded decimal format 
(for example, 0x700 for ARM700) 


Bits[23:16] contain the architecture version 
(for example, 0x00 = Version 3) 


Bits[31:24] contain the ASCII code of an implementor’s trademark 
(for example, 0x41 = A = ARM Ltd.) 


Writing to CP15 register 0 is UNPREDICTABLE.


7.5.2 Register 1: Control Register


31 11 10 9 8 7 6 5 4 3 2 1 0


UNP/SBZ | F | R | S | B | L | D | P | W | C | A | M


Reading from CP15 register 1 is UNPREDICTABLE. 


Writing to CP15 register 1 sets the control bits. The CRm and opcode_2 fields are 
IGNORED when writing CP15 register 1, and SHOULD BE ZERO. 


All control bits are set to zero on reset. The control bits have the following functions: 


M Bit 0 Memory Management Unit (MMU) Enable/Disable
0 = MMU disabled
1 = MMU enabled


A Bit 1 Alignment Fault Enable/Disable
0 = Address alignment fault checking disabled
1 = Address alignment fault checking enabled


C Bit 2 Instruction and data cache Enable/Disable
0 = Instruction and data cache (IDC) disabled
1 = Instruction and data cache (IDC) enabled


W Bit 3 Write buffer Enable/Disable
0 = Write buffer disabled
1 = Write buffer enabled


P Bit 4 32-bit/26-bit Exception handlers
Implementations that support 26-bit configurations use this bit to
control the PROG32 signal (see Chapter 5, The 26-bit Architectures).
0 = 26-bit exception handlers
1 = 32-bit exception handlers


D Bit 5 32-bit/26-bit data address range
Implementations that support 26-bit data spaces use this bit to control
the DATA32 signal (see Chapter 5, The 26-bit Architectures).
0 = 26-bit data address checking enabled
1 = 26-bit data address checking disabled (32-bit data addresses)


L Bit 6 IMPLEMENTATION DEFINED


B Bit 7 Big endian/Little endian
0 = Little endian operation
1 = Big endian operation


S Bit 8 System protection
This bit modifies the MMU protection system.


R Bit 9 ROM protection
This bit modifies the MMU protection system.


F Bit 10 IMPLEMENTATION DEFINED


Bits 31:11 When read, these bits return an UNPREDICTABLE value;
when written, SHOULD BE ZERO.




Enabling the MMU 

Care must be taken if the translated address differs from the untranslated address, 

as the instructions following the enabling of the MMU will have been fetched using no 
address translation, and enabling the MMU may be considered as a branch with delayed 
execution. A similar situation occurs when the MMU is disabled. 


The correct code sequence for enabling and disabling the MMU is IMPLEMENTATION 
DEFINED. 


If the cache and/or write buffer are enabled when the MMU is not enabled, the results 
are UNPREDICTABLE. 


7.5.3 Register 2: Translation Table Base Register 


31 14 13 0 


Translation Table Base UNP/SBZ 


Reading from CP15 register 2 is UNPREDICTABLE. 


Writing to CP15 register 2 updates the pointer to the currently active first-level 
translation table from the value in bits[31:14] of the written value. Bits[13:0] must be 
written as zero. The CRm and opcode_2 fields are IGNORED when writing CP15 
register 2, and SHOULD BE ZERO. 


7.5.4 Register 3: Domain Access Control Register 


31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0


D15 | D14 | D13 | D12 | D11 | D10 | D9 | D8 | D7 | D6 | D5 | D4 | D3 | D2 | D1 | D0


Reading from CP15 register 3 is UNPREDICTABLE. 


Writing to CP15 register 3 writes the value of the Domain Access Control Register. 
The CRm and opcode_2 fields are IGNORED when writing CP15 register 3, and SHOULD 
BE ZERO. 


The Domain Access Control Register consists of sixteen 2-bit fields, each of which 
defines the access permissions for one of the sixteen Domains (D15-D0). 
For the meaning of each field, see 7.9 Domains on page 7-23. 

7.5.5 Register 4: Reserved 


Reading and writing CP15 register 4 is UNPREDICTABLE. 
7.5.6 Register 5: Fault Status Register and Flush TLB


31 9 8 7 4 3 0


UNP/SBZ | 0 | Domain | Status


Reading CP15 register 5 returns the value of the Fault Status Register (FSR). The FSR 
contains the source of the last data fault. Note that only the bottom 9 bits are returned. 

The upper 23 bits are UNPREDICTABLE. The FSR indicates the domain and type of access 
being attempted when an abort occurred: 


Bit 8 returns zero 


Bits 7:4 specify which of the 16 domains (D15-D0) was being accessed when 
a fault occurred 


Bits 3:0 indicate the type of access being attempted 


The encoding is shown in Table 7-9: Priority encoding of fault status on page 7-25. 
The FSR is only updated for data faults, not for prefetch faults. The CRm and opcode_2 
fields are IGNORED when reading CP15 register 5, and SHOULD BE ZERO. 


Writing CP15 register 5 flushes the TLB. An ARMv3 implementation may only include 
a combined instruction and data TLB, and not separate instruction and data TLBs. 
The data written to the register is IGNORED, and SHOULD BE ZERO. 


7.5.7 Register 6: Fault Address Register and Flush TLB Entry 


31 0 


Fault address 


Reading CP15 register 6 returns the value of the Fault Address Register (FAR). 

The FAR holds the virtual address of the access which was attempted when a fault 
occurred. The FAR is only updated for data faults, not for prefetch faults. The CRm and 
opcode_2 fields are IGNORED when reading CP15 register 6, and SHOULD BE ZERO. 


Writing CP15 register 6 flushes a single entry from the TLB. An ARMv3 implementation 
may only include a combined instruction and data TLB, and not separate instruction and 
data TLBs. The data written to the register is the virtual address to be flushed. 


7.5.8 Register 7: Flush Cache 


Reading CP15 register 7 is UNPREDICTABLE. 

Writing to CP15 register 7 is used to flush the instruction and data cache. An ARMv3 
implementation may only include a combined instruction and data cache, and not 
separate instruction and data caches. The data written to the register is IGNORED, and 
SHOULD BE ZERO. 


7.5.9 Registers 8-15: Reserved 


Accessing (reading or writing) any of these registers will cause an undefined instruction 
exception. 
7.6 Memory Management Unit (MMU) Architecture


7.6.1 Overview


The ARM MMU performs two primary functions:
* it translates virtual addresses into physical addresses
* it controls memory access permissions

The MMU hardware required to perform these functions consists of:
* at least one Translation Lookaside Buffer (TLB)
* access control logic
* translation-table-walking logic


For implementations with separate Instruction and Data caches, separate TLBs for 
instruction and data are also likely. 


The translation lookaside buffer 


The TLB caches virtual to physical address translations and access permissions for 
each translation. If the TLB contains a translated entry for the virtual address, the access 
control logic determines whether access is permitted. If access is permitted, the MMU 
outputs the appropriate physical address corresponding to the virtual address. If access 
is not permitted, the MMU signals the CPU to abort. 


If the TLB misses (it does not contain a translated entry for the virtual address), 
the translation table walk hardware is invoked to retrieve the translation and access 
permission information from a translation table in physical memory. Once retrieved, 
the information is placed into the TLB, possibly overwriting an existing entry. 


Memory accesses 
The MMU supports memory accesses based on sections or pages: 


Sections consist of 1MB blocks of memory


Pages Two different page sizes are supported: 
small pages consist of 4kB blocks of memory 
large pages consist of 64kB blocks of memory 


Sections and large pages are supported to allow mapping of a large region of memory 
while using only a single entry in the TLB. Additional access control mechanisms are 
extended within small pages to 1kB sub-pages and within large pages to 16kB 
sub-pages. 

Translation table 

The translation table held in main memory has two levels: 


first-level table holds both section translations and pointers to second-level 
tables 


second-level tables hold both large and small page translations 
Domains


The MMU also supports the concept of domains. These are areas of memory that can 
be defined to possess individual access rights. The Domain Access Control Register is 
used to specify access rights for up to 16 separate domains. 


When the MMU is turned off (as happens on reset), the virtual address is output directly 
as the physical address, and no memory access permission checks are performed. 


It is UNPREDICTABLE if two TLB entries address overlapping areas of memory. This can
occur if the TLB is not flushed after memory is re-mapped with different-sized pages
(the old mapping, with its original size, remains in the TLB while the new mapping is
loaded into a different TLB location).


7.6.2 Translation process 


The MMU translates virtual addresses generated by the CPU into physical addresses 
to access external memory, and also derives and checks the access permission. There 
are three routes by which the address translation (and hence permission check) takes 
place. The route taken depends on whether the address in question has been marked 
as a section-mapped access or a page-mapped access; and there are two sizes of 
page-mapped access (large pages and small pages). 


However, the translation process always starts out in the same way, as described
below, with a first-level fetch. A section-mapped access requires only a first-level fetch,
but a page-mapped access also requires a second-level fetch.


7.6.3 Translation table base 


The translation process is initiated when the on-chip TLB does not contain an entry for 
the requested virtual address. The Translation Table Base Register points to the base 
of the first-level table. Only bits 31 to 14 of the Translation Table Base Register are 
significant; bits 13 to 0 should be zero. Therefore, the first-level page table must reside 
on a 16Kbyte boundary. 


7.6.4 First-level fetch 


Bits 31:14 of the Translation Table Base register are concatenated with bits 31:20 of 
the virtual address to produce a 30-bit address as illustrated in Figure 7-1: Accessing 
the translation table first-level descriptors on page 7-16. This address selects a 
four-byte translation table entry which is a first-level descriptor for a section or a pointer 
to a second-level page table. 
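
The address formation described above can be sketched in C. This is a minimal illustration only; the function name and parameter names are hypothetical and are not defined by the architecture.

```c
#include <stdint.h>

/* Hypothetical helper: forms the address of a first-level descriptor.
 * ttb: value of the Translation Table Base Register
 * va:  virtual address being translated */
uint32_t first_level_descriptor_address(uint32_t ttb, uint32_t va)
{
    uint32_t translation_base = ttb & 0xFFFFC000u; /* TTB bits 31:14 */
    uint32_t table_index = va >> 20;               /* VA bits 31:20 */

    /* The 30-bit result is followed by two zero bits, because each
     * descriptor is four bytes long. */
    return translation_base | (table_index << 2);
}
```

Masking the low 14 bits of the base reflects the requirement that the first-level table resides on a 16Kbyte boundary.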
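[Figure: the translation base (bits 31:14 of the Translation Table Base Register) and
the table index (bits 31:20 of the virtual address) are concatenated, with bits 1:0 set
to 00, to form the address of the first-level descriptor.]


Figure 7-1: Accessing the translation table first-level descriptors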


7.6.5 First-level descriptors 


The first-level descriptor may define either a section descriptor or a pointer to
a second-level page table, and its format varies accordingly. Figure 7-2: First-level
descriptor format shows the format; bits[1:0] indicate the descriptor type and validity.


Accessing a descriptor that has bits[1:0] = 0b00 generates a translation fault
(see 7.10 Aborts on page 7-24).


Accessing a descriptor that has bits[1:0] = 0b11 is UNPREDICTABLE.


Fault         bits[1:0] = 0b00
Page Table    bits[31:10] = page table base address, bits[8:5] = domain,
              bits[1:0] = 0b01
Section       bits[31:20] = section base address, bits[11:10] = AP,
              bits[8:5] = domain, bits[3:2] = C and B, bits[1:0] = 0b10
Reserved      bits[1:0] = 0b11


Figure 7-2: First-level descriptor format
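
As an illustration, the descriptor type can be read from bits[1:0] as follows. This is a sketch only; the enumeration and function names are hypothetical, and the encodings for page table (0b01) and section (0b10) descriptors are those given in the descriptor sections of this chapter.

```c
#include <stdint.h>

/* Hypothetical names for the four first-level descriptor types. */
typedef enum {
    L1_FAULT      = 0, /* 0b00: generates a translation fault */
    L1_PAGE_TABLE = 1, /* 0b01: pointer to a second-level page table */
    L1_SECTION    = 2, /* 0b10: section descriptor */
    L1_RESERVED   = 3  /* 0b11: UNPREDICTABLE */
} l1_descriptor_type;

/* Returns the type encoded in bits[1:0] of a first-level descriptor. */
l1_descriptor_type first_level_type(uint32_t descriptor)
{
    return (l1_descriptor_type)(descriptor & 0x3u);
}
```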
7.6.6 Section descriptor and translating section references


If the first-level descriptor is a section descriptor, the fields have the following meanings:


Bits 1:0      Identify the type of descriptor (0b10 marks a section descriptor).


Bits 3:2      The cacheable and bufferable bits. See 7.7 Cache and Write Buffer
              Control on page 7-22.


Bit 4         The meaning of this bit is IMPLEMENTATION DEPENDENT.


Bits 8:5      The domain field specifies one of the sixteen possible domains for all
              the pages controlled by this descriptor.


Bit 9         This bit is not currently used, and SHOULD BE ZERO.


Bits 11:10    Access permissions. These bits control the access to the section.
              See Table 7-7: Access permissions on page 7-23 for the
              interpretation of these bits.


Bits 19:12    These bits are not currently used, and SHOULD BE ZERO.


Bits 31:20    The Section Base Address forms the top 12 bits of the physical
              address.


Figure 7-3: Section translation illustrates the complete section translation sequence. 
Note that the access permissions contained in the first-level descriptor must be checked 
before the physical address is generated. The sequence for checking access 
permissions is described in 7.8 Access Permissions on page 7-22. 
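
The field layout above can be sketched as a small decoding routine. This is an illustrative sketch only: the structure and function names are hypothetical, it assumes C is bit 3 and B is bit 2 (the order in which Table 7-6 lists them), and it performs no domain or permission checks.

```c
#include <stdint.h>

/* Hypothetical container for the fields of a section descriptor. */
typedef struct {
    uint32_t physical_address; /* section base (31:20) + section index */
    unsigned ap;               /* access permissions, bits 11:10 */
    unsigned domain;           /* domain number, bits 8:5 */
    unsigned c;                /* cacheable bit (assumed bit 3) */
    unsigned b;                /* bufferable bit (assumed bit 2) */
} section_mapping;

/* Decodes a section descriptor and forms the physical address for va.
 * The caller is assumed to have checked that bits[1:0] == 0b10. */
section_mapping translate_section(uint32_t descriptor, uint32_t va)
{
    section_mapping m;

    m.physical_address = (descriptor & 0xFFF00000u) | (va & 0x000FFFFFu);
    m.ap     = (descriptor >> 10) & 0x3u;
    m.domain = (descriptor >> 5) & 0xFu;
    m.c      = (descriptor >> 3) & 0x1u;
    m.b      = (descriptor >> 2) & 0x1u;
    return m;
}
```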





[Figure: the translation base (Translation Table Base Register bits 31:14) and the table
index (virtual address bits 31:20) form the address of the first-level descriptor. The
section base address from that descriptor (bits 31:20) and the section index (virtual
address bits 19:0) form the physical address.]


Figure 7-3: Section translation
7.6.7 Page table descriptor


If the first-level descriptor is a page table descriptor, the fields have the following
meanings:


Bits 1:0      Identify the type of descriptor (0b01 marks a page table descriptor).
Bits 4:2      The meaning of these bits is IMPLEMENTATION DEPENDENT.
Bits 8:5      The domain field specifies one of the sixteen possible domains for all
              the pages controlled by this descriptor.
Bit 9         This bit is not currently used, and SHOULD BE ZERO.
Bits 31:10    The Page Table Base Address is a pointer to a second-level page
              table, giving the base address for a second-level fetch to be
              performed. Second-level page tables must be aligned on a 1Kbyte
              boundary.


If a page table descriptor is returned from the first-level fetch, a second-level fetch is
initiated to retrieve a second-level descriptor, as shown in Figure 7-4: Accessing the
translation table second-level descriptors.
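
The second-level fetch address can be sketched in the same way as the first-level one. The function name is hypothetical; the use of virtual address bits 19:12 as the second-level table index follows from the 1Kbyte table size (256 four-byte entries).

```c
#include <stdint.h>

/* Hypothetical helper: forms the address of a second-level descriptor
 * from a page table descriptor and the virtual address. */
uint32_t second_level_descriptor_address(uint32_t l1_descriptor, uint32_t va)
{
    uint32_t page_table_base = l1_descriptor & 0xFFFFFC00u; /* bits 31:10 */
    uint32_t table_index = (va >> 12) & 0xFFu;              /* VA bits 19:12 */

    return page_table_base | (table_index << 2);            /* bits 1:0 = 00 */
}
```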





[Figure: the first-level fetch uses the translation base and the first-level table index
(virtual address bits 31:20) to locate the first-level descriptor. The page table base
address from that descriptor (bits 31:10) and the second-level table index (virtual
address bits 19:12) then form the address of the second-level descriptor.]


Figure 7-4: Accessing the translation table second-level descriptors
7.6.8 Second-level descriptor


The second-level descriptor may define either a large page or a small page access.
Table 7-5: Second-level descriptor format shows the format; bits[1:0] indicate
the descriptor type and validity.


Accessing a descriptor that has bits[1:0] = 0b00 generates a translation fault
(see 7.10 Aborts on page 7-24).


Accessing a descriptor that has bits[1:0] = 0b11 is UNPREDICTABLE.


Fault         bits[1:0] = 0b00
Large page    bits[31:16] = large page base address, bits[11:4] = ap3, ap2, ap1, ap0,
              bits[3:2] = C and B, bits[1:0] = 0b01
Small page    bits[31:12] = small page base address, bits[11:4] = ap3, ap2, ap1, ap0,
              bits[3:2] = C and B, bits[1:0] = 0b10
Reserved      bits[1:0] = 0b11


Table 7-5: Second-level descriptor format
The fields in both large and small pages have the following meanings:


Bits 1:0      Identify the type of descriptor.


Bits 3:2      The cacheable and bufferable bits. See 7.7 Cache and Write Buffer
              Control on page 7-22.


Bits 11:4     Access permissions. These bits control access to the page. See
              Table 7-7: Access permissions on page 7-23 for the interpretation
              of these bits.

              Both large and small pages are split into four sub-pages:
              AP0 encodes the access permissions for the first sub-page
              AP1 encodes the access permissions for the second sub-page
              AP2 encodes the access permissions for the third sub-page
              AP3 encodes the access permissions for the fourth (last) sub-page


Bits 15:12    Are not currently used for large pages, and must be zero.


Bits 31:12    Are used to form the corresponding bits of the physical address
              (small pages).


Bits 31:16    Are used to form the corresponding bits of the physical address
              (large pages).
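
Selecting the access permission field for a given address can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, AP0 is assumed to occupy bits 5:4 and AP3 bits 11:10, and the caller states whether the descriptor is a large page rather than the sketch decoding bits[1:0].

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the 2-bit AP field that applies to va.
 * Small pages have 1kB sub-pages (selected by VA bits 11:10);
 * large pages have 16kB sub-pages (selected by VA bits 15:14). */
unsigned subpage_ap(uint32_t l2_descriptor, uint32_t va, bool is_large_page)
{
    unsigned subpage = is_large_page ? (va >> 14) & 0x3u
                                     : (va >> 10) & 0x3u;

    /* Assumed layout: AP0 in bits 5:4, AP1 in 7:6, AP2 in 9:8, AP3 in 11:10. */
    return (l2_descriptor >> (4u + 2u * subpage)) & 0x3u;
}
```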
7.6.9 Translating large page references


Figure 7-5: Large page translation shows the complete translation sequence for a
64Kbyte large page.


Note    As the upper four bits of the Page Index and low-order four bits of the
        Second-level Table Index overlap, each page table entry for a large page must
        be duplicated 16 times (in consecutive memory locations) in the page table.


[Figure: virtual address bits 31:20 index the first-level table and bits 19:12 index the
second-level table. The large page base address from the second-level descriptor
(bits 31:16) and the page index (virtual address bits 15:0) form the physical address.]


Figure 7-5: Large page translation
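
The large page address formation and the 16-fold duplication noted above can be sketched as follows. The function names are hypothetical, and the second-level table is modelled as a plain array of 256 words.

```c
#include <stdint.h>

/* Forms the physical address for an access through a large page descriptor. */
uint32_t large_page_physical_address(uint32_t l2_descriptor, uint32_t va)
{
    return (l2_descriptor & 0xFFFF0000u) | (va & 0x0000FFFFu);
}

/* Writes the same large page descriptor into the 16 consecutive
 * second-level entries that cover the 64kB page containing va.
 * page_table points to a 256-entry second-level table. */
void map_large_page(uint32_t *page_table, uint32_t va, uint32_t l2_descriptor)
{
    /* Second-level index with its low four bits cleared. */
    uint32_t first = ((va >> 12) & 0xFFu) & ~0xFu;

    for (uint32_t i = 0; i < 16; i++)
        page_table[first + i] = l2_descriptor;
}
```

Any of the 16 entries then yields the same descriptor, whichever 4kB slice of the large page the virtual address falls in.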
7.6.10 Translating small page references


Figure 7-6: Small page translation shows the complete translation sequence for
a 4Kbyte small page.


Figure 7-6: Small page translation
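
The complete small page sequence can be sketched as a software table walk. This is a simplified model, not a description of any implementation: physical memory is represented as a word array starting at address 0, the names are hypothetical, the small page type encoding is assumed to be 0b10, and domain and permission checks are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory model: a word array that starts at physical address 0. */
static uint32_t read_word(const uint32_t *mem, uint32_t pa)
{
    return mem[pa >> 2];
}

/* Walks both table levels for a small page mapping. Returns false when
 * the first-level entry is not a page table descriptor or the
 * second-level entry is not a small page descriptor. */
bool translate_small_page(const uint32_t *mem, uint32_t ttb,
                          uint32_t va, uint32_t *pa)
{
    /* First-level fetch: TTB bits 31:14, VA bits 31:20. */
    uint32_t l1 = read_word(mem, (ttb & 0xFFFFC000u) | ((va >> 20) << 2));
    if ((l1 & 0x3u) != 0x1u)
        return false;

    /* Second-level fetch: descriptor bits 31:10, VA bits 19:12. */
    uint32_t l2 = read_word(mem, (l1 & 0xFFFFFC00u) |
                                 (((va >> 12) & 0xFFu) << 2));
    if ((l2 & 0x3u) != 0x2u)
        return false;

    /* Small page base (bits 31:12) plus page index (VA bits 11:0). */
    *pa = (l2 & 0xFFFFF000u) | (va & 0x00000FFFu);
    return true;
}
```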
7.7 Cache and Write Buffer Control


The ARM memory system is controlled by two attributes which are individually 
selectable for each virtual page: 


Cacheable This attribute indicates that data in the page may be cached, so that 
subsequent accesses may not access main memory. Cacheable also 
indicates that instruction speculative prefetching beyond the current 
point of execution may be performed. The cache implementation may 
use a write-back or a write-through policy (or a choice of either for 
individual virtual pages). 


Bufferable This attribute indicates that data in the page may be stored in the write 
buffer, allowing faster write operations for processors that operate 
faster than main memory. The write buffer may not preserve strict 
write ordering, and may not ensure that multiple writes to the same 
location result in multiple off-chip writes. 


The Cacheable and Bufferable bits in the Section and Page descriptors control caching 
and buffering. 


C   B   Meaning
0   0   Uncached, Unbuffered
0   1   Uncached, Buffered
1   0   Cached, Unbuffered or Writethrough cached, Buffered
1   1   Cached, Buffered or Writeback cached, Buffered


Table 7-6: Cache and bufferable bit meanings


Implementations that offer both writethrough and writeback caching use the 0b10 value to
specify writethrough caching, and the 0b11 value to specify writeback caching.
Implementations that only offer one type of cache behaviour (writeback or writethrough)
use the C and B bits strictly as cache enable and write buffer enable respectively.


Note that writeback cache implementations that do not also support writethrough
caching may not provide cached, unbuffered memory (as the writeback cache
effectively buffers writes).


7.8 Access Permissions 


The access permission bits in section and page descriptors control access to the 
corresponding section or page. The access permissions are modified by the System (S) 
and ROM (R) control bits. Table 7-7: Access permissions on page 7-23 describes the 
meaning of the access permission bits in conjunction with the S and R bits. If an access
is made to an area of memory without the required permission, a Permission Fault is
raised; see 7.10 Aborts on page 7-24.
AP    S   R   Supervisor permissions   User permissions
00    0   0   No Access                No Access
00    1   0   Read Only                No Access
00    0   1   Read Only                Read Only
00    1   1   UNPREDICTABLE
01    x   x   Read/Write               No Access
10    x   x   Read/Write               Read Only
11    x   x   Read/Write               Read/Write


Table 7-7: Access permissions
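
The table can be expressed as a small decision function. This is a sketch only: the function name and parameters are hypothetical, and the UNPREDICTABLE combination (AP = 00 with S = 1 and R = 1) is simply treated as "no access" here, which is a choice made for the example and not architectural behaviour.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if Table 7-7 permits the access.
 * ap:         2-bit AP field from the section or page descriptor
 * s, r:       System and ROM control bits
 * privileged: true selects the Supervisor column, false the User column
 * is_write:   true for a write access, false for a read */
bool access_permitted(unsigned ap, bool s, bool r, bool privileged, bool is_write)
{
    switch (ap & 0x3u) {
    case 0:
        if (s && r)
            return false;      /* UNPREDICTABLE; denied in this sketch */
        if (is_write)
            return false;      /* AP = 00 never allows writes */
        if (r)
            return true;       /* read only for Supervisor and User */
        return s && privileged; /* read only for Supervisor, or no access */
    case 1:
        return privileged;     /* Supervisor read/write, User no access */
    case 2:
        return privileged || !is_write; /* User is read only */
    default:
        return true;           /* read/write for both */
    }
}
```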


7.9 Domains


A domain is a collection of sections, large pages and small pages. The ARM
architecture supports 16 domains; access to each domain is controlled by a 2-bit field 
in the Domain Access Control Register. Each field allows the access to an entire domain 
to be enabled and disabled very quickly, so that whole memory areas can be swapped 
in and out of virtual memory very efficiently. 


Two kinds of domain access are supported: 


Clients     are users of domains (execute programs, access data), and are
            guarded by the access permissions of the individual sections and
            pages that make up the domain


Managers    control the behaviour of the domain (the current sections and pages
            in the domain, and the domain access), and are not guarded by the
            access permissions of individual sections and pages in the domain

One program can be a client of some domains, and a manager of some other domains, 
and have no access to the remaining domains. This allows very flexible memory 
protection for programs that access different memory resources. Table 7-8: Domain 
Access Values illustrates the encoding of the bits in the Domain Access Control 
Register. 


Value   Access Types   Description
0b00    No Access      Any access will generate a domain fault
0b01    Client         Accesses are checked against the access permission
                       bits in the section or page descriptor
0b10    Reserved       Using this value has UNPREDICTABLE results
0b11    Manager        Accesses are not checked against the access
                       permission bits in the section or page descriptor,
                       so a permission fault cannot be generated


Table 7-8: Domain Access Values
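
Looking up the access type for a domain can be sketched as follows. The names are hypothetical, and the sketch assumes that the 2-bit field for domain n occupies bits 2n+1:2n of the Domain Access Control Register.

```c
#include <stdint.h>

/* Hypothetical names for the values in Table 7-8. */
typedef enum {
    DOMAIN_NO_ACCESS = 0, /* 0b00 */
    DOMAIN_CLIENT    = 1, /* 0b01 */
    DOMAIN_RESERVED  = 2, /* 0b10 */
    DOMAIN_MANAGER   = 3  /* 0b11 */
} domain_access;

/* Returns the access type of the given domain (0 to 15).
 * Assumes domain n is held in DACR bits 2n+1:2n. */
domain_access domain_access_type(uint32_t dacr, unsigned domain)
{
    return (domain_access)((dacr >> (2u * (domain & 0xFu))) & 0x3u);
}
```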
7.10 Aborts


The mechanisms that can cause the ARM processor to halt execution because of
memory access restrictions are:


MMU fault         The MMU detects the restriction and signals the processor.
External abort    The external memory system signals an illegal memory
                  access.


Collectively, MMU faults and external aborts are just called aborts. Accesses that cause
aborts are said to be aborted.


If the memory request that is aborted is an instruction fetch, then a Prefetch Abort
Exception is raised if and when the processor attempts to execute the instruction
corresponding to the illegal access. If the aborted access is a data access, a Data Abort
Exception is raised. See 2.5 Exceptions on page 2-6 for more information about
Prefetch and Data Aborts.


7.11 MMU Faults


The MMU generates four types of faults:
*  Alignment Fault
*  Translation Fault
*  Domain Fault
*  Permission Fault

The memory system may abort three types of access:
*  Line Fetches
*  Memory Accesses (uncached or unbuffered accesses)
*  Translation Table Accesses


Aborts that are detected by the MMU are stopped before any external memory access 
takes place. It is the responsibility of the external system to stop external accesses that 
cause external aborts. 


The System Control coprocessor contains two registers which are updated when a data 
access is aborted. These registers are not updated for prefetch aborts, as the aborted 
instruction may not be executed due to changes in program flow. 
7.11.1 Fault Address Register (FAR) and Fault Status Register (FSR)


Aborts resulting from data accesses (data aborts) are immediately acted upon by
the CPU. The Fault Status Register (FSR) is updated with a 4-bit Fault Status (FS[3:0])
and the domain number of the access. In addition, the virtual address which caused the
data abort is written into the Fault Address Register (FAR). If a data access
simultaneously generates more than one type of data abort, they are prioritised in the
order given in Table 7-9: Priority encoding of fault status.


Aborts arising from instruction fetches are simply flagged as the instruction enters
the instruction pipeline. Only when (and if) the instruction is executed does it cause
a prefetch abort; a prefetch abort is not acted upon if the instruction is not used
(e.g. it is branched around). Because instruction prefetch aborts may or may not be
acted upon, the FSR and FAR are not updated (the value of the PC saved in R14_abt
after the exception occurs can be used to calculate the fault address).














Priority   Sources                                         FS[3:0]   Domain[3:0]   FAR
Highest    Terminal Exception                              0b0010    invalid       IMPLEMENTATION DEFINED
           Vector Exception                                0b0000    invalid       valid
           Alignment                                       0b00x1    invalid       valid
           External Abort on Translation    First level    0b1100    invalid       valid
                                            Second level   0b1110    valid         valid
           Translation                      Section        0b0101    invalid       valid
                                            Page           0b0111    valid         valid
           Domain                           Section        0b1001    valid         valid
                                            Page           0b1011    valid         valid
           Permission                       Section        0b1101    valid         valid
                                            Page           0b1111    valid         valid
           External Abort on Linefetch      Section        0b0100    valid         valid
                                            Page           0b0110    valid         valid
Lowest     External Abort on Non-linefetch  Section        0b1000    valid         valid
                                            Page           0b1010    valid         valid


Table 7-9: Priority encoding of fault status


Notes
1  Alignment faults may write either 0b0001 or 0b0011 into FS[3:0].
2  Invalid values in Domain[3:0] occur because the fault is raised before a valid
   domain field has been loaded.
3  Any abort masked by the priority encoding may be regenerated by fixing
   the primary abort and restarting the instruction.
4  The FS[3:0] encoding for Vector Exception breaks from the pattern that FS[0]
   is zero for all external aborts.
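
A data abort handler might classify the fault along these lines. This is a sketch with hypothetical names; it assumes that FS[3:0] occupies bits 3:0 of the FSR and that Domain[3:0] occupies bits 7:4, and it does not show how the FSR is read from the System Control coprocessor.

```c
#include <stdint.h>

/* Hypothetical names for the fault sources in Table 7-9. */
typedef enum {
    FAULT_TERMINAL,
    FAULT_VECTOR,
    FAULT_ALIGNMENT,
    FAULT_EXTERNAL_TRANSLATION,
    FAULT_TRANSLATION,
    FAULT_DOMAIN,
    FAULT_PERMISSION,
    FAULT_EXTERNAL_LINEFETCH,
    FAULT_EXTERNAL_NON_LINEFETCH
} fault_kind;

/* Assumed FSR layout: status in bits 3:0, domain in bits 7:4. */
unsigned fsr_status(uint32_t fsr) { return fsr & 0xFu; }
unsigned fsr_domain(uint32_t fsr) { return (fsr >> 4) & 0xFu; }

/* Maps an FS[3:0] value to its source according to Table 7-9. */
fault_kind classify_fault(unsigned fs)
{
    switch (fs & 0xFu) {
    case 0x0:           return FAULT_VECTOR;
    case 0x1: case 0x3: return FAULT_ALIGNMENT;
    case 0x2:           return FAULT_TERMINAL;
    case 0x4: case 0x6: return FAULT_EXTERNAL_LINEFETCH;
    case 0x5: case 0x7: return FAULT_TRANSLATION;
    case 0x8: case 0xA: return FAULT_EXTERNAL_NON_LINEFETCH;
    case 0x9: case 0xB: return FAULT_DOMAIN;
    case 0xC: case 0xE: return FAULT_EXTERNAL_TRANSLATION;
    default:            return FAULT_PERMISSION; /* 0xD and 0xF */
    }
}
```

For the faults that distinguish sections from pages, FS[1] is 0 for the section (or first-level) case and 1 for the page (or second-level) case.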
7.11.2 Fault-checking sequence


The sequence by which the MMU checks for access faults is slightly different for
Sections and Pages. Figure 7-7: Sequence for checking faults on page 7-27 illustrates
the sequence for both types of access. The sections and figures that follow describe
the conditions that generate each of the faults.


7.11.3 Vector Exceptions


When the processor is in a 32-bit configuration (PROG32 is active) and in a 26-bit mode
(CPSR[4] == 0), data accesses (but not instruction fetches) to the hard vectors (addresses
0x0 to 0x1f) will cause a data abort, known as a vector exception. See 7.11.3 Vector
Exceptions on page 7-26 for a full description. It is IMPLEMENTATION DEFINED if vector
exceptions are generated when the MMU is not enabled.


7.11.4 Alignment fault


If Alignment Faults are enabled, an alignment fault will be generated on any data word
access whose address is not word-aligned (virtual address bits [1:0] != 0b00), or any
halfword access that is not halfword-aligned (virtual address bit[0] != 0). Alignment faults
will not be generated on any instruction fetch, or on any byte access.


Note that if the access generates an alignment fault, the access will be aborted without
reference to further permission checks. It is IMPLEMENTATION DEFINED if alignment
exceptions are generated when the MMU is not enabled.


7.11.5 Translation fault


There are two types of translation fault:


Section    is generated if the first-level descriptor is marked as invalid.
           This happens if bits[1:0] of the descriptor are both 0.


Page       is generated if the second-level descriptor is marked as invalid.
           This happens if bits[1:0] of the descriptor are both 0.


7.11.6 Domain fault


There are two types of domain fault:

*  Section
*  Page

In both cases, the first-level descriptor holds the 4-bit Domain field which selects one of
the sixteen 2-bit domains in the Domain Access Control Register. The two bits of
the specified domain are then checked for access permissions as detailed in Table
7-8: Domain Access Values on page 7-23.


In the case of a section, the domain is checked when the first-level descriptor is
returned, and in the case of a page, the domain is checked when the second-level
descriptor is returned. If the specified access is marked as No Access in the Domain
Access Control Register, either a Section Domain Fault or Page Domain Fault occurs.
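
The alignment, translation and domain checks described above can be sketched for a section-mapped data access as follows. This is an illustrative model with hypothetical names; the check order follows the priorities in Table 7-9, domain n is assumed to occupy DACR bits 2n+1:2n, and the reserved domain value (0b10), whose results are UNPREDICTABLE, is reported as a domain fault purely as a choice for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of the checks made on a section access. */
typedef enum {
    CHECK_ALIGNMENT_FAULT,
    CHECK_TRANSLATION_FAULT,
    CHECK_DOMAIN_FAULT,
    CHECK_PERMISSIONS, /* client: the AP bits must still be checked */
    CHECK_GRANTED      /* manager: the AP bits are not checked */
} check_result;

/* size_bytes is 1, 2 or 4; alignment_faults_enabled models the
 * alignment fault enable; l1_descriptor is the fetched descriptor. */
check_result check_section_access(uint32_t va, unsigned size_bytes,
                                  bool alignment_faults_enabled,
                                  uint32_t l1_descriptor, uint32_t dacr)
{
    if (alignment_faults_enabled &&
        ((size_bytes == 4 && (va & 0x3u) != 0) ||
         (size_bytes == 2 && (va & 0x1u) != 0)))
        return CHECK_ALIGNMENT_FAULT;

    if ((l1_descriptor & 0x3u) == 0)
        return CHECK_TRANSLATION_FAULT;

    unsigned domain = (l1_descriptor >> 5) & 0xFu;
    unsigned access = (dacr >> (2u * domain)) & 0x3u;

    if (access == 0x1u)
        return CHECK_PERMISSIONS;
    if (access == 0x3u)
        return CHECK_GRANTED;
    return CHECK_DOMAIN_FAULT; /* No Access, or reserved in this sketch */
}
```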
7.11.7 Permission fault


There are section permission faults and sub-page permission faults.


Permission faults are checked at the same time as domain faults. If the 2-bit domain field
returns client (0b01), the permission access check is invoked as follows:


Section     If the first-level descriptor defines a section access, the AP bits of
            the descriptor define whether or not the access is allowed, according
            to Table 7-7: Access permissions on page 7-23. If the access is not
            allowed, a Section Permission fault is generated.


Sub-page    If the first-level descriptor defines a page-mapped access, the
            second-level descriptor specifies four access permission fields
            (ap3, ap2, ap1, ap0) each corresponding to one quarter of the page.
            For small pages, ap3 is selected by the top 1kB of the page, and ap0
            is selected by the bottom 1kB of the page. For large pages, ap3 is
            selected by the top 16kB of the page, and ap0 is selected by the
            bottom 16kB of the page. The selected AP bits are then interpreted in
            exactly the same way as for a section (see Table 7-7: Access
            permissions on page 7-23), the only difference being that the fault
            generated is a Sub-page Permission fault.


7.12 External Aborts


In addition to the MMU faults, the ARM Architecture defines an external abort pin which
may be used to flag an error on an external memory access. However, not all accesses 
can be aborted in this way, so this pin must be used with great care. The following 
accesses may be externally aborted and restarted safely: 


*  Reads
*  Unbuffered writes
*  First-level descriptor fetch
*  Second-level descriptor fetch
*  Multi-bus master semaphores


A linefetch may be safely aborted on any word in the line transfer. If the abort happens
on data that has been requested by the processor (rather than data that is being fetched
as the remainder of a cache line), the access will be aborted. Any data transferred that
is not immediately accessed (the remainder of the cache line) will only be aborted when
it is accessed.


It is IMPLEMENTATION-DEFINED if the FAR points to the start address of the cache line, or 
the address that generated the abort. 


Buffered writes cannot be externally aborted. Therefore, the system must be configured
such that it does not do buffered writes to areas of memory which are capable of flagging
an external abort, or a different mechanism should be used to signal the abort
(an interrupt for example).


The contents of a memory location that causes an abort are UNPREDICTABLE after
the abort.
7.13 System-level Issues


This section lists a number of issues that need to be addressed by the system designer 
and operating systems to provide an ARMv4 compatible system. 


7.13.1 Memory systems, write buffers and caches 


ARMv4 processors and software expect to be connected to a byte-addressed memory.
Word and halfword accesses to the memory will ignore the alignment of the address and
return the naturally-aligned value that is addressed (so a memory access will ignore
address bits 0 and 1 for word accesses, and will ignore bit 0 for halfword accesses).
ARMv4 processors must implement some method for switching between big-endian
and little-endian addressing of the memory system (if CP15 is implemented, bit 7 of
register 1 controls endianness). It is IMPLEMENTATION DEFINED if the endianness can be
changed dynamically.
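
The address bits that the memory system ignores can be illustrated with a short sketch. The function name is hypothetical; it models only the address seen by the memory and nothing about how the processor presents the returned data.

```c
#include <stdint.h>

/* Hypothetical helper: returns the naturally-aligned address that a
 * byte-addressed memory uses for an access of size_bytes (1, 2 or 4). */
uint32_t memory_effective_address(uint32_t address, unsigned size_bytes)
{
    if (size_bytes == 4)
        return address & ~0x3u; /* word: bits 1 and 0 are ignored */
    if (size_bytes == 2)
        return address & ~0x1u; /* halfword: bit 0 is ignored */
    return address;             /* byte: all bits are used */
}
```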


Memory that is used to hold programs and data will be marked as follows: 
Main (RAM) memory will normally be set as cacheable and bufferable 


ROM memory will normally be set as cacheable, and will be 
marked as read only (so the bufferable attribute is 
not used, and SHOULD BE ONE) 


Write buffers 


An ARMv4 implementation may incorporate a merging write buffer, that subsumes
multiple writes to the same location into a single write to main memory. Furthermore, 
a write buffer may re-order writes, so that writes are issued to memory in a different 
order to the order in which they are issued by the processor. Thus IO locations should 
never be marked as bufferable, to ensure all writes are issued, and in the correct order, 
to the IO device. 


Caches 


Frame buffers may be cacheable, but frame buffers on writeback cache 
implementations must be copied back to memory after the frame buffer has been 
updated. Frame buffers may be bufferable, but again the write buffer must be written 
back to memory after the frame buffer has been updated. 


ARMv4 does not support cache coherency between the ARM and other system bus 
masters (bus snooping is not supported). Memory data that is shared between multiple 
bus masters should be mapped as uncacheable to ensure that all reads access main 
memory, and unbufferable to ensure all writes do access main memory. IO devices that 
are mapped into the memory map should be marked as uncacheable and unbufferable. 


The coherency of data buffers that are read or written by another bus master may be
managed in software, by cleaning data from writeback caches and write buffers to
memory when the processor has written to the data buffer and before the other bus
master reads the buffer, and flushing relevant data from caches when the buffer is being
read after the other bus master has written the buffer. An uncached, unbuffered semaphore
can be used to maintain synchronisation between multiple bus masters.
See 7.14 Semaphores on page 7-31.
For implementations with writeback caches, all dirty cache data must be written back
before any alterations are made to the MMU page tables, to ensure that cache line write
back may use the page tables to form the correct physical address for the transfer.


Caches may be indexed using either virtual or physical addresses. Physical pages must
only be mapped into a single virtual page, otherwise the result is UNPREDICTABLE. ARMv4
does not provide coherency between multiple virtual copies of a single physical page.


Some ARM implementations support separate instruction and data caches.
The coherency between the data and instruction cache may not be maintained in
hardware, so if the instruction stream is written, the instruction cache and data cache
must be made coherent. This may entail cleaning the data cache (storing dirty data to
memory), draining the write buffer (completing all buffered writes), and flushing the
instruction cache. Instruction and data memory incoherency occurs after a program has
been loaded (and thus treated as data) and is about to be executed, or if self-modifying
code is used or generated.


7.13.2 Interrupts


ARM processors implement fast and normal levels of interrupt.


Both interrupts are signalled externally, and many implementations will synchronise
interrupts before an exception is raised. A fast interrupt request (FIQ) will disable
subsequent normal and fast interrupts by setting the I and F bits in the CPSR, and
a normal interrupt request (IRQ) will disable subsequent normal interrupts by setting
the I bit in the CPSR. See 2.5 Exceptions on page 2-6.
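
The effect of interrupt entry on the interrupt disable bits can be sketched as follows. The names are hypothetical, the sketch assumes that I is bit 7 and F is bit 6 of the CPSR, and it models only the disable bits, not the mode change or the saving of the CPSR.

```c
#include <stdint.h>

/* Assumed CPSR bit positions for the interrupt disable bits. */
#define CPSR_I_BIT (1u << 7) /* disables normal interrupts (IRQ) */
#define CPSR_F_BIT (1u << 6) /* disables fast interrupts (FIQ) */

/* An IRQ disables subsequent normal interrupts. */
uint32_t cpsr_after_irq_entry(uint32_t cpsr)
{
    return cpsr | CPSR_I_BIT;
}

/* An FIQ disables subsequent normal and fast interrupts. */
uint32_t cpsr_after_fiq_entry(uint32_t cpsr)
{
    return cpsr | CPSR_I_BIT | CPSR_F_BIT;
}
```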


Cancelling interrupts 


It is the responsibility of software (the interrupt handler) to ensure that the cause of
an interrupt is cancelled (no longer signalled to the processor) before interrupts are
re-enabled (by clearing the I and/or F bit in the CPSR). Interrupts may be cancelled with
any instruction that may make an external data bus access; that is, any load or store,
a swap, or any coprocessor instruction.


Cancelling an interrupt via an instruction fetch is UNPREDICTABLE. 


Cancelling an interrupt with a load multiple that restores the CPSR and re-enables 
interrupts is UNPREDICTABLE. 


Devices that do not instantaneously cancel an interrupt (i.e. they do not cancel
the interrupt before letting the access complete) should be probed by software to ensure
that interrupts have been cancelled before interrupts are re-enabled. This allows
a device connected to a remote IO bus to operate correctly.
7.14 Semaphores


The Swap and Swap Byte instructions have predictable behaviour when used in two
ways:


*  Multi-bus master systems that use the Swap instructions to implement
   semaphores to control interaction between different bus masters.
   In this case, the semaphores must be placed in an uncached and unbufferable
   region of memory. The Swap instruction will then cause a (locked) read-write
   bus transaction.
   This type of semaphore may be externally aborted.

*  Systems with multiple threads running on a uni-processor that use the Swap
   instructions to implement semaphores to control interaction of the threads.
   In this case, the semaphores may be placed in a cached and bufferable region
   of memory, and a (locked) read-write bus transaction may not occur.
   This system is likely to have better performance than the multi-bus master
   system above.
   This type of semaphore has UNPREDICTABLE behaviour if it is externally aborted.


Semaphores placed in non-cacheable/bufferable memory regions have UNPREDICTABLE
results. Semaphores placed in cacheable/non-bufferable memory regions have
UNPREDICTABLE results.
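
A binary semaphore built on an atomic exchange can be sketched in portable C. This is an illustration only: the C11 `atomic_exchange` call stands in for the Swap instruction, the type and function names are hypothetical, and whether a given compiler implements the exchange with SWP on an ARMv4 target is not assumed here. The placement of the semaphore in memory, as described above, remains the responsibility of the system designer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical semaphore type: 0 means free, 1 means held. */
typedef atomic_uint semaphore_t;

/* Atomically writes 1 and examines the previous value, as a swap
 * would. Returns true if the semaphore was free and is now held. */
bool semaphore_try_acquire(semaphore_t *semaphore)
{
    return atomic_exchange(semaphore, 1u) == 0u;
}

/* Marks the semaphore as free again. */
void semaphore_release(semaphore_t *semaphore)
{
    atomic_store(semaphore, 0u);
}
```

A caller that fails to acquire the semaphore would typically retry or yield to another thread.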
MUL
ARM 3-58
Thumb 6-58 
multiply 
32-bit 3-10 
64-bit 3-10 
by constant 3-89 
example 4-2 
instructions 1-5 
list of 3-10, 3-11 
multiply (MUL) instruction 
ARM 3-58
Thumb 6-58 
multiply accumulate (MLA) instruction 3-52 
multi-way branch 4-5, 4-7 
MVN 
ARM 3-59
Thumb 6-59 


NEG 6-60 
negate (NEG) instruction 6-60 
operands
data-processing 3-84 
immediate 3-84 
shifted register 3-85 
shifter 3-86 
operating system 1-3, 1-4, 2-7, 3-75 
ORR 
ARM 3-60 
Thumb 6-61 
page table descriptor 7-18
PC. See Program Counter 
permission fault 7-28 
POP 6-62 
pop multiple registers (POP) instruction 6-62 
prefetch abort 2-6 
procedure 
call and return 4-4
entry and exit 
example 4-8 
processor mode 1-3, 1-5 
26-bit and 32-bit 5-7 
26-bit architectures 5-2 
abort mode 2-2 
changing 2-2, 3-56 
fast interrupt mode 2-2, 2-3 
interrupt mode 2-2 
mode bits 2-5 
privileged 1-3, 2-2, 2-6 
reset 2-7 
supervisor mode 2-2 
system mode 1-3, 2-2 
undefined mode 2-2 
unprivileged modes 1-4 
user mode 2-2 
PROG32 signal 5-8 
program counter (PC) 1-2, 1-4, 2-3, 3-6 
26-bit architecture 5-2 
program status register 3-12 
26-bit architectures 5-4 
access instructions 3-12 
instructions 
list of 3-12 
PUSH 6-64 
push multiple registers (PUSH) instruction 6-64 
register 15
26-bit architectures 5-2
program counter bits 5-3
registers
banked 1-3, 2-6
banks of 2-3
high 6-8
operand value 3-85
overview 1-2
shifted operand value 3-85
reset 2-6 
return address 1-6, 3-6 
reverse subtract (RSB) instruction 3-61 
reverse subtract with carry (RSC) instruction 3-62 
RISC (Reduced Instruction Set Computer) 1-2 


ROR 
ARM 3-95 to 3-96, 3-102, 3-105, 3-108
Thumb 6-66 


rotate left with extend 3-30 
rotate right (ROR) instruction 
ARM 


as addressing mode 3-102, 3-105, 3-108 
immediate 3-96 
register 3-95 
Thumb 
register 6-66 
rotate right with extend (RRX) instruction 3-97 
ARM 
as addressing mode 3-105 
as addressing mode 3-102, 3-105, 3-108 
RRX 3-97, 3-102, 3-105, 3-108
ARM 3-105
RSB 3-61
RSC 3-62 
Saved Program Status Register. See SPSR


SBC
ARM 3-63
Thumb 6-67 


second-level descriptor 7-19 
section descriptor 7-17 
section references 7-17 


semaphores 1-6, 7-31 
examples 4-9 
instruction list 3-19 
shift 1-5, 3-7, 3-53
instructions 1-5 
shifted register 1-5 
shifter operand 3-7, 3-84 
default 3-85 
register 3-85 
signals 
DATA32 5-8 
PROG32 5-8 
signed multiply accumulate long (SMLAL) instruction 
signed multiply long (SMULL) instruction 3-65 
sign-extend 1-5, 2-2 
SMLAL 3-64 
SMULL 3-65 
software interrupt (SWI) 1-3, 1-4, 2-6 
examples 4-10 


instruction 
ARM 3-75 
Thumb 6-81 


SPSR 1-3, 1-5, 2-3, 3-12 
26-bit architectures 5-2 

stack pointer (SP) 1-2, 1-3, 1-6, 2-3 
incrementing 6-26 

status register access instructions 3-12 


status register transfer instructions 1-5 
See also MRS and MSR instructions 


status registers. See program status register 
STC 3-66 


STM 
ARM 3-67 to 3-68 
Thumb 6-68 


store coprocessor (STC) instruction 3-66 
store multiple (STM) instruction 
ARM 3-67 to 3-68 
Thumb 6-68 
store register (STR) instruction 
ARM 3-69 
Thumb 
immediate 6-70 
register 6-71 
SP-relative 6-72 
store register byte (STRB) instruction 
ARM 3-70 
Thumb 
immediate 6-73 
register 6-74 
store register byte with translation (STRBT) instruction 3-71 
store register halfword (STRH) instruction 
ARM 3-72 
Thumb 
immediate 6-75 
register 6-76 
store register with translation (STRT) instruction 3-73 
STR 
ARM 3-69 
Thumb 6-70 to 6-72 
STRB 
ARM 3-70 
Thumb 6-73 to 6-74 
STRBT 3-71 
STRH 
ARM 3-72 
Thumb 6-75 to 6-76 
string compare 4-6 
STRT 3-73 
SUB 
ARM 3-74 
Thumb 6-77 to 6-80 
subroutine 
call 1-2, 2-3, 3-6, 3-33 
call and return 1-6 
return 3-53 
return address 1-6 
subtract (SUB) instruction 
ARM 3-74 
Thumb 
decrement stack pointer 6-80 
immediate 6-77 
large immediate 6-78 
large register 6-79 
subtract with carry (SBC) instruction 
ARM 3-63 
Thumb 6-67 
swap byte (SWPB) instruction 3-77 
swap word (SWP) instruction 3-76 
swapping 
byte order (endianness) 4-3 
register and memory values 1-6 
SWI 
ARM 3-75 
Thumb 6-81 
SWP 3-76 
SWPB 3-77 
system control coprocessor 7-2 
ARM version 3 7-10 
ARM version 4 7-2 
See also coprocessor 15 
system mode 1-3 
See also processor mode, system mode 


T 


TEQ 3-78 
TEQP (26-bit test equivalence) instruction 5-4 to 5-5, 5-8 


test (TST) instruction 
ARM 3-79 
Thumb 6-82 
test equivalence (TEQ) instruction 3-78 
THUMB 6-12 
architecture 6-1 
ARM code execution 2-5 
branch instructions 6-5 
list of 6-6 
code 1-4, 3-35 
code execution 2-5 
data-processing instructions 6-7 
list of 6-8, 6-100 
instruction set 1-4, 6-1 
overview 6-4 
load and store instructions 
examples 6-13 
list of 6-13 
load and store multiple instructions 6-14 
examples 6-14 
list of 6-15 
T flag 2-5, 3-35 
translating 
large page references 7-20 
section references 7-17 
small page references 7-21 
translation fault 7-26 
translation lookaside buffers 7-14 
architecture version 3 7-13 
coprocessor 15 7-9 
flush functions 7-9 
translation table 7-14 
base 7-15 
base register 
architecture version 3 7-12 
coprocessor 15 7-6 


TST 
ARM 3-79 
Thumb 6-82 


TSTP (26-bit test) instruction 5-4 to 5-5, 5-8 


U 


UMLAL 3-80 
UMULL 3-81 


undefined instruction 1-3, 2-6 
extension space 3-27 


unsigned multiply accumulate long (UMLAL) instruction 3-80 


unsigned multiply long (UMULL) instruction 3-81 
user mode 1-2, 1-3 
context switch 4-13 
See also processor mode, user mode 
variables 4-5 
vector exception 5-8, 5-9, 7-26 
vectors 1-3, 2-6 


virtual memory 1-3 


W 


word 1-5 
write buffer 7-2, 7-22, 7-29 
zero-extend 1-5, 2-2 